Updates on the Morfyk Trilogy Writing a novel is hard. Like, getting a second Ph.D. kind of hard. Writing three novels at the same time is a fool's errand. So, you can label me a fool. I started in May 2022 and now it's June 5, 2025. I sent the
Federated Learning and GANs - The Lost Chapter Overview The below is a chapter that I ended up not publishing in my final PhD dissertation because the nature of the research - proposed during my proposal phase a year before - was too complex. The thrust of my dissertation research focused on GANs writ large and multimodality. But adding
My first Kaggle Challenge (2018) I recently started digging through my Google Drive and found a trove of old papers I wrote. One article I wrote is particularly entertaining. It must have been the first time I competed in a Kaggle Challenge and it was an image classification problem of dog breeds! I'm
An Overview of Image Registration Image registration is a standard technique in computer vision, defined in the following ways: “directly overlaying one image on top of another” (Reddy et al), “aligning two or more images” (De Vos et al), and the “process of transforming different image datasets into one coordinate system with matched imaging contents”
Multimodal Representations Dear Reader: I came across a paper I had started drafting in 2019 on Multimodal Machine Learning - at the time a very new and fascinating field that I was considering applying toward my Ph.D. Lo and behold, here we are nearly six years later and my dissertation
Value of multimodal networks In natural language processing (NLP), researchers use one type of data, text, to perform tasks such as text classification, question answering, or language generation. In computer vision, by contrast, algorithms like support vector machines (SVMs) or convolutional neural networks (CNNs) use image data extracted from static images or
The Good Ol' CNN Dear Reader: I recently found a trove of papers I wrote back in 2019 and thought to post a few snippets! Convolutional Neural Networks (CNNs) gained popularity in 2012 when AlexNet, developed by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, won that year’s ILSVRC (ImageNet Large-Scale Visual Recognition Challenge) classifying images
What is Multimodal AI, anyway? Clearly, we see this MMAI everywhere nowadays. Enter a prompt in Midjourney and you now get an image. From VQGAN-CLIP, whose author released her code several years ago to offer the first example of using text prompts to generate images, to RunwayGen, which makes videos (like this 4-sec clip I
A Beginner's Quick Brief to Model Learning Once I deciphered how a neural network operates and was able to diagram and code it by hand using nothing but NumPy, I was hooked. It was as if I finally broke through and could interpret all the complex diagrams I saw on the internet. Shortly after I came to
Easy Statistics for Model Evaluation Statistics is the tried and true way of evaluating the performance of a machine learning model. I wrote the material below about two years ago when I was in the thick of my PhD. It's a basic overview of many terms you'll hear over and over