Python: Writing Hex Values into YAML

YAML can express hex values, which are decoded as ordinary integers. However, when you dump a YAML document, those integers are written back out as decimals. In order to write actual hex values, you need to wrap your value in another type and then tell the YAML encoder how to represent it.

This is possible with the ruamel.yaml encoder (available on PyPI).

An example of how to do this:

import sys

import ruamel.yaml


class HexInt(int):
    """Marker type for integers that should be dumped in hex notation."""
    pass


def representer(dumper, data):
    # Emit the value as a YAML int scalar, formatted as zero-padded hex.
    return dumper.represent_scalar(
        'tag:yaml.org,2002:int',
        '0x{:04x}'.format(data))


ruamel.yaml.add_representer(HexInt, representer)

data = {
    'item1': {
        'string_value': 'some_string',
        'hex_value': HexInt(641),
    }
}

ruamel.yaml.dump(data, sys.stdout, default_flow_style=False)

Output:

item1:
  hex_value: 0x0281
  string_value: some_string

Please note that I require my hex values to be two bytes wide and zero-padded, so the example above prints four hex digits (plus the 0x prefix): 0x{:04x}. If that doesn't work for you, change the format to whatever you require.

Thanks to this post.
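
Note that ruamel.yaml.dump() and ruamel.yaml.add_representer() are the older, PyYAML-compatible API. If your version of ruamel.yaml steers you toward the newer YAML() instance API instead, a minimal sketch of the same idea (reusing HexInt and representer() from above) might look like this:

import sys

import ruamel.yaml

# Instance-based API; reuses HexInt and representer() from the example above.
yaml = ruamel.yaml.YAML()
yaml.default_flow_style = False
yaml.representer.add_representer(HexInt, representer)

yaml.dump(data, sys.stdout)

The representer itself does not change; only the registration and dump calls do.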


Git: Producing a Revert Commit for a Previous Change

Create an inverse commit to flip a previous change. Child’s play:

$ git revert <REFSPEC>

The new commit looks like:

commit 09cc98e3fa121774750728f5fa337befeb02d914
Author: Dustin Oprea <dustin@randomingenuity.com>
Date:   Tue Mar 27 16:02:25 2018 -0400

    Revert "What's the worst that could happen?"

    This reverts commit cf4fc9a50a20a633b82ee28ef9efa46df86db18d.

A lot nicer and a lot more professional than copying and pasting into a new commit or dropping an old commit with a rebase.

It is also nearly identical to the equivalent features provided by many code-review systems.

GoogleAutoAuth: Automate the Google API Authentication Flow in Your Project

I write a ton of system software and tools. I’ve written a few independent tools against the Google APIs. They use OAuth, as most reputable APIs do.

Unfortunately, manually integrating the authentication flow into your system tools (read: headless, non-interactive) gets painful after you have done it a couple of times, and it is at least as painful for your users, especially when they have to log in to a system that they do not usually touch just to periodically debug authentication.

The normal flow:

  1. Developer: Request a URL from the Google client tools.
  2. Developer: Present the URL to the user and have them open it in a browser.
  3. User: Log in.
  4. User: Acknowledge that the tool will be able to access the user's data.
  5. Google: The authorization portal provides a code/token.
  6. User: Provide the code to the tool at the command line.
  7. Developer: Perform a final authorization with Google using the token.

With some mild wizardry in our Python tools, we can reduce this to two basic steps:

  1. Developer: Initialize the auto-authentication framework with the Google application-identity information at program load.
  2. User: Call the tool, then authenticate and authorize when prompted.

The tool will automatically launch a webserver on an open port, open the default browser with the Google login and authorization prompt, and then write the authorization to disk.
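
python-googleautoauth handles this for you. Just to illustrate the underlying trick (and not that library's actual API), here is a rough sketch of the same local-webserver flow using Google's google-auth-oauthlib package; the client-secrets filename, scope, and token filename below are placeholders:

from google_auth_oauthlib.flow import InstalledAppFlow

# Placeholder values; substitute your own application identity and scopes.
CLIENT_SECRETS_FILENAME = 'client_secrets.json'
SCOPES = ['https://www.googleapis.com/auth/drive.readonly']

flow = InstalledAppFlow.from_client_secrets_file(
    CLIENT_SECRETS_FILENAME, scopes=SCOPES)

# Starts a local web server on an open port, opens the default browser to
# Google's login/consent page, and captures the redirect automatically.
credentials = flow.run_local_server(port=0)

# Persist the authorization so subsequent runs are non-interactive.
with open('authorization.json', 'w') as f:
    f.write(credentials.to_json())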

In the event that you want or need to do things manually, a generic tool is provided that can produce URLs and accept authorization codes.

For more information, see python-googleautoauth.

 

 

Repo: How to Parse and Use a Manifest Directly From Python

Repo is a tool from AOSP (Android) that allows you to manage a vast hierarchy of individual Git repositories. It's basically a small Python tool that adds some abstraction around Git commands. The manifest that controls the project tree is written in XML, can include submanifests, can assign projects to different groups (so you do not have to clone all of them every time), can include additional command primitives to do file copies and tweak how the manifests are loaded, etc. The manifest format has a basic specification but, still, it is a lot easier to avoid parsing it yourself.

You can access the built-in manifest-parsing functionality directly from the Repo tool, using the copy of the tool that is embedded in your tree under .repo/repo.

For example, to load a manifest:

/tree/.repo/repo$ python
>>> import manifest_xml
>>> xm = manifest_xml.XmlManifest('/tree/.repo')

Obviously, you’ll be [temporarily] manipulating the sys.path to load this from your integration.

To explore, you can play with the "projects" property (a list of project objects) and the "paths" property (a dictionary mapping relative paths to project objects).

Number of projects:

>>> print(len(xm.projects))
878
>>> print(len(xm.paths))
878

paths is a dictionary.

A project object looks like:

>>> p = xm.projects[0]
>>> p


>>> dir(p)
['AbandonBranch', 'AddAnnotation', 'AddCopyFile', 'AddLinkFile', 'CheckoutBranch', 'CleanPublishedCache', 'CurrentBranch', 'Derived', 'DownloadPatchSet', 'Exists', 'GetBranch', 'GetBranches', 'GetCommitRevisionId', 'GetDerivedSubprojects', 'GetRegisteredSubprojects', 'GetRemote', 'GetRevisionId', 'GetUploadableBranch', 'GetUploadableBranches', 'HasChanges', 'IsDirty', 'IsRebaseInProgress', 'MatchesGroups', 'PostRepoUpgrade', 'PrintWorkTreeDiff', 'PrintWorkTreeStatus', 'PruneHeads', 'StartBranch', 'Sync_LocalHalf', 'Sync_NetworkHalf', 'UncommitedFiles', 'UploadForReview', 'UserEmail', 'UserName', 'WasPublished', '_ApplyCloneBundle', '_CheckDirReference', '_CheckForSha1', '_Checkout', '_CherryPick', '_CopyAndLinkFiles', '_ExtractArchive', '_FastForward', '_FetchArchive', '_FetchBundle', '_GetSubmodules', '_GitGetByExec', '_InitAnyMRef', '_InitGitDir', '_InitHooks', '_InitMRef', '_InitMirrorHead', '_InitRemote', '_InitWorkTree', '_IsValidBundle', '_LoadUserIdentity', '_Rebase', '_ReferenceGitDir', '_RemoteFetch', '_ResetHard', '_Revert', '_UpdateHooks', '__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_allrefs', '_getLogs', '_gitdir_path', '_revlist', '_userident_email', '_userident_name', 'annotations', 'bare_git', 'bare_objdir', 'bare_ref', 'clone_depth', 'config', 'copyfiles', 'dest_branch', 'enabled_repo_hooks', 'getAddedAndRemovedLogs', 'gitdir', 'groups', 'is_derived', 'linkfiles', 'manifest', 'name', 'objdir', 'old_revision', 'optimized_fetch', 'parent', 'rebase', 'relpath', 'remote', 'revisionExpr', 'revisionId', 'shareable_dirs', 'shareable_files', 'snapshots', 'subprojects', 'sync_c', 'sync_s', 'upstream', 'work_git', 'working_tree_dirs', 'working_tree_files', 'worktree']

The relative path for the project:

>>> path = p.relpath
>>> xm.paths[path]

The revision for the project:

>>> p.revisionExpr
u'master'

The remote for the project:

>>> p.GetRemote('origin').url
u'ssh://gerrit.company.com:2537/android/platform/external/lzma'
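
Putting paths and GetRemote() together, for example, you could dump every project's checkout path and remote URL (assuming, as above, a remote named "origin"):

# List each project's relative checkout path and its remote URL.
for rel_path, project in xm.paths.items():
    print('{0} -> {1}'.format(rel_path, project.GetRemote('origin').url))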

You can also get a config object representing the Git config for the bare archive of the project:

>>> p.config


>>> dir(p.config)
['ForRepository', 'ForUser', 'GetBoolean', 'GetBranch', 'GetRemote', 'GetString', 'GetSubSections', 'Global', 'Has', 'HasSection', 'SetString', 'UrlInsteadOf', '_ForUser', '_Global', '_Read', '_ReadGit', '_ReadJson', '_SaveJson', '__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_branches', '_cache', '_cache_dict', '_do', '_json', '_remotes', '_section_dict', '_sections', 'defaults', 'file']

>>> p.config.file
u'/tree/.repo/projects/external/lzma.git/config'

An example of how to efficiently establish a mapping of project names to checkout paths:

import os
import sys

_MAPPING_CACHE = {}

def get_repo_project_to_path_mapping(root_path):
    # Return a cached mapping if we have already built one for this tree.
    try:
        return _MAPPING_CACHE[root_path]
    except KeyError:
        pass

    repo_meta_path = os.path.join(root_path, '.repo')
    repo_tool_path = os.path.join(repo_meta_path, 'repo')

    # Make the embedded Repo tool importable.
    if repo_tool_path not in sys.path:
        sys.path.insert(0, repo_tool_path)

    import manifest_xml

    xm = manifest_xml.XmlManifest(repo_meta_path)

    # Map each project name to its relative checkout path.
    project_to_path_mapping = {}
    for rel_path, p in xm.paths.items():
        project_to_path_mapping[str(p.name)] = str(rel_path)

    _MAPPING_CACHE[root_path] = project_to_path_mapping
    return project_to_path_mapping
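
A quick usage sketch, assuming the same /tree checkout from the interactive session above:

mapping = get_repo_project_to_path_mapping('/tree')

# Number of projects in the manifest (878 in the tree above).
print(len(mapping))

# Show a few project-name -> checkout-path pairs.
for name, rel_path in sorted(mapping.items())[:5]:
    print('{0} -> {1}'.format(name, rel_path))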

Go: Image-Processing Microservice

Imaginary is a fun little package that can be Dockerized and deployed next to the rest of your services to offload image-processing:

Fast HTTP microservice written in Go for high-level image processing backed by bimg and libvips. imaginary can be used as private or public HTTP service for massive image processing with first-class support for Docker & Heroku. It’s almost dependency-free and only uses net/http native package without additional abstractions for better performance.

Go: Functions that satisfy interfaces

I ran across this while sifting through the http package:

// The HandlerFunc type is an adapter to allow the use of
// ordinary functions as HTTP handlers. If f is a function
// with the appropriate signature, HandlerFunc(f) is a
// Handler that calls f.
type HandlerFunc func(ResponseWriter, *Request)

// ServeHTTP calls f(w, r).
func (f HandlerFunc) ServeHTTP(w ResponseWriter, r *Request) {
    f(w, r)
}

Any ordinary function with the right signature can be converted to an http.HandlerFunc, and because HandlerFunc has a ServeHTTP method defined on it (a method on a function type, who knew), the converted value satisfies the http.Handler interface as well. No struct required.

I looked for a footnote about this in the Go language specification's section on function declarations but was not successful. The receiver is little more than a repositioned first argument, and there is nothing that says it MUST be a struct rather than a function type.

Add Timestamps to Your Bash History

Put this in your profile script (.bashrc):

export HISTTIMEFORMAT="%d/%m/%y %T "

I also find it helpful to increase the size of my command-line history:

export HISTFILESIZE=10000
export HISTSIZE=10000

The first controls how many commands are stored in the history file on disk; the second controls how many are kept in memory for the current session. In recent versions of Bash, you can also set these to a negative value for an effectively unlimited history, if you so wish.