Hi everyone! I thought I’d share an update on what I’ve been up to and how the progress looks. This week was a little slow in terms of writing code, but that’s the fun part: half of the time is (and should be) spent on discussions, building consensus around your ideas and weighing the available options. I’ve had a great experience doing that. My mentor Andrew encouraged me to discuss my thoughts openly on GitHub discussions, and it was a great learning experience for me.

One of Moja Global’s community members, Padmaja, had a PR that implemented pre-processing for GCBM simulations, and I had a great time learning by reading through it. I should mention that I realized how important PR descriptions and review comments are for others who want the context behind a change. Just look at the PR I shared above: its detailed description and the review comments from the reviewers and the author gave me the context I needed.

I tried reproducing the results and had to switch from Pop OS to the Manjaro distribution. I couldn’t reproduce them on Manjaro (some failures), so that took some time, but it was still a great experience to debug the problem and discuss it with others. There was an active discussion on how the endpoints should be designed, and I learned a lot from it as well. I presented my ideas and thoughts here.

My mentor, Andrew, had shared some pseudocode with me to help unblock me from the issues I hit while running the endpoints, and I think that helped a lot. Sometimes I feel that just writing out an implementation, instead of spending a lot of time on design, helps. And that’s what I did, thanks to the motivation from Andrew. Here is the repo I made to share it with others and get a first round of feedback: https://github.com/ankitaS11/gcbm_preprocessing. I’m thinking of having classes with a “GCBM” prefix, something like GCBMList, which provides useful overloads of Python’s list class. The following is the base class:

import pathlib


class GCBMList:
    """
    This is a base class for GCBM pre-processing scripts to use. It prevents users from doing: <config>.append(<anything that is not a file>)
    """
    def __init__(self, category=None):
        self.data = list()
        self.config = list()
        self.category = category

    def __iter__(self):
        # __iter__ must return an iterator; returning the list itself raises TypeError
        return iter(self.data)

    def __getitem__(self, idx):
        return self.data[idx]

    def is_category(self, path):
        if self.category is None:
            raise NotImplementedError("Please implement `is_category` method, which is used by append() method")
        else:
            return self.category in path

    # Unlike list.append() in Python, this returns a bool - whether it was successful or not
    def append(self, file_path):
        if self.is_category(file_path):
            self.data.append(file_path)
            return True
        return False
    
    def update_config(self):
        raise NotImplementedError("Need a `update_config` method here.")

    def generate_config(self):
        raise NotImplementedError("Need a `generate_config` method here.")

    @staticmethod
    def change_extension(file_path, new_extension):
        # TODO: let's use pathlib.Path everywhere, for now it's okay here
        pathlib_path = pathlib.Path(file_path)
        return pathlib_path.with_suffix(new_extension)

I personally liked writing this implementation. :)
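
To give an idea of how the base class might be used, here is a minimal sketch of a subclass. The subclass name `GCBMDisturbanceList`, the "disturbances" category, the file paths and the shape of the generated config are all hypothetical and only there for illustration; they are not taken from the repo. The base class is repeated in condensed form so the snippet runs on its own.

```python
import pathlib


class GCBMList:
    """Condensed copy of the base class, repeated so this snippet is self-contained."""

    def __init__(self, category=None):
        self.data = []
        self.config = []
        self.category = category

    def __iter__(self):
        return iter(self.data)

    def __getitem__(self, idx):
        return self.data[idx]

    def is_category(self, path):
        if self.category is None:
            raise NotImplementedError("Set a category or override `is_category`.")
        return self.category in str(path)

    def append(self, file_path):
        # Returns True if the path was accepted, False otherwise
        if self.is_category(file_path):
            self.data.append(file_path)
            return True
        return False

    @staticmethod
    def change_extension(file_path, new_extension):
        return pathlib.Path(file_path).with_suffix(new_extension)


class GCBMDisturbanceList(GCBMList):
    """Hypothetical subclass: accepts only paths that contain "disturbances"."""

    def __init__(self):
        super().__init__(category="disturbances")

    def generate_config(self):
        # Assumed config shape, for illustration only: one .json file name per input
        self.config = [self.change_extension(p, ".json").name for p in self.data]
        return self.config


layers = GCBMDisturbanceList()
accepted = layers.append("input/disturbances_2011.tiff")  # matches the category
rejected = layers.append("input/Classifier1.tiff")        # does not match

print(accepted, rejected)        # True False
print(list(layers))              # ['input/disturbances_2011.tiff']
print(layers.generate_config())  # ['disturbances_2011.json']
```

Having `append()` return a bool means the calling script can decide what to do with a rejected file (skip it, log it, or try another list) without needing exceptions.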

Well, that’s mostly what I had to share today, and I’ll come back soon, hopefully with more progress. Till then, take care everyone, see you in the next blog <3