Navigating Cross-Genre Singing Style Transfer with VAE-GAN Architecture
Voice style transfer has primarily focused on transforming speech from one speaker to another, leaving singing style transfer largely unexplored. To address this gap, we introduce SingStyleTransfer, a model that uses a Variational Autoencoder-Generative Adversarial Network (VAE-GAN) architecture to perform style transfer across genres while maintaining the original vocal identity. Trained on the SingStyle111 dataset, the model preserves the semantic content of the audio and produces realistic genre-to-genre transformations. This project sets the stage for more advanced applications in music production, vocal coaching, and AI-driven performance synthesis, and shows the potential to personalize and adapt singing styles across various musical contexts.
A Supervised Learning AI Model for Automated Holistic Vocal Performance Feedback
Scalable music education requires fast feedback on student audio performances. Manual feedback currently comes from teachers, which makes it subjective and therefore sometimes inaccurate. Existing software tools check whether a student sings a single note correctly rather than evaluating the entire piece, so they provide neither cumulative nor numerical feedback. This paper presents an AI model that automatically grades vocal recordings cumulatively on pitch and rhythm against a reference piece of music, predicting a numerical grade for the performance. The model is then tested for accuracy on a dataset of paired audio recordings (performance and reference) tagged with human scores. Besides demonstrating the feasibility of an objective music grading system, the investigation also reveals important limitations and subjectivity in current music grading systems, opening opportunities for future work in the community.
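As a rough illustration of what "cumulative" pitch and rhythm grading can mean, the sketch below summarizes a whole performance as two error features measured against the reference. It is not the paper's model: the note representation (onset time in seconds, pitch in Hz), the assumption that notes are already aligned one-to-one, and the names `cents_error` and `extract_features` are all hypothetical. In a supervised setup, features like these could be the inputs to a regressor trained on the human scores.

```python
import math


def cents_error(performed_hz, reference_hz):
    """Signed pitch deviation in cents (100 cents = one semitone)."""
    return 1200.0 * math.log2(performed_hz / reference_hz)


def extract_features(performed, reference):
    """Summarize a whole performance against its reference.

    Both arguments are lists of (onset_seconds, pitch_hz) tuples, one per
    note, assumed to be aligned note-for-note (a hypothetical simplification).
    """
    if not reference or len(performed) != len(reference):
        raise ValueError("performance and reference must have the same, non-zero length")

    # Absolute pitch error per note, in cents.
    pitch_errors = [abs(cents_error(p[1], r[1])) for p, r in zip(performed, reference)]
    # Absolute timing error per note onset, in seconds.
    onset_errors = [abs(p[0] - r[0]) for p, r in zip(performed, reference)]

    return {
        "mean_pitch_error_cents": sum(pitch_errors) / len(pitch_errors),
        "mean_onset_error_s": sum(onset_errors) / len(onset_errors),
    }


# Hypothetical two-note example: the second note is sung 0.1 s late.
reference_notes = [(0.0, 440.0), (0.5, 880.0)]
performed_notes = [(0.0, 440.0), (0.6, 880.0)]
features = extract_features(performed_notes, reference_notes)
```

Averaging over all notes is what makes the features cumulative: one wrong note shifts the score slightly instead of producing a single pass/fail verdict.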
Academic Papers
The Role of Facial Features and Mannerisms in Detecting Deepfakes
This project seeks to develop a machine learning model that identifies deepfakes in order to limit the spread of misinformation. Politicians and celebrities are the most affected by deepfakes, since fake videos can endanger their reputations and careers. Most current approaches attempt to build a single model across different videos and use it for detection, which does not yield very accurate results. This study focuses on deepfakes with a single face and uses facial feature extraction for detection. I propose a novel approach that classifies videos using features obtained from facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation. I conducted 10 different experiments building detection models with classification algorithms and found that 9 of them achieved an accuracy higher than 95% using the facial feature extraction approach (using OpenFace2).
Cracking Crime with Code: The Applications of AI in Law Enforcement
Crime does not wait for anyone, but what if the solution could anticipate it? The role of neural networks in our everyday lives has grown exponentially since the first computational models of neurons were made in the 1940s. From phones to cars to finances and medical care, Artificial Intelligence shifts the way we live. In recent years, with more interest and advancement in Generative AI, more people have been using large-scale neural networks for daily tasks, such as learning about topics or planning schedules. However, beyond serving as an assistant for ordinary citizens, AI has the potential to revolutionize law enforcement processes and justice proceedings. Through generating sketches, deepfake video detection, noise distinction detection, data analysis advancements, and voice cloning, AI can bring huge benefits for crime control. Even with these advantages and advancements, algorithms can be biased, and we need to weigh the ethical implications of using AI for crime-solving.
From Cash Registers to Clicks: The Sonic Revolution
In the early 1990s, music shops were pulsating with anticipation as fans flocked to bustling record and CD shops to get their hands on a copy of Nirvana’s newest album, Nevermind. The air was charged with the excitement of discovering new music, and the familiar “ding!” of the cashier ringing up each sale resonated over the excited chatter. The transaction was a straightforward exchange: one new CD for $15.98. Fast forward to today, where the music landscape has undergone a revolutionary transition. The era of physical CDs has given way to a digital age where music is accessible with just a couple of clicks for no cost at all. With the introduction of Napster, iTunes, Soundcloud, and other digital streaming services, music has transformed from a private good to a public good, affecting the way artists profit through streams, and increasing artists' reliance on concerts, media coverage, virality, and merchandise.
Minds and Machines: Tracing the Evolution of Artificial Neural Networks
As society stands at the crossroads of innovation, tracing the path from Rosenblatt's Perceptron and Parallel Distributed Processing (PDP) to the ever-evolving landscape of current AI makes the journey into the origins of Artificial Intelligence truly riveting. The human fascination with understanding the brain and intelligence has consistently inspired endeavors to emulate these phenomena. Early scientists embarked on a path that aimed to model the workings of the human brain. Consequently, a substantial portion of the subsequent research and achievements in this domain came from individuals with expertise in psychology, neuroscience, and computer science. The first major discovery relating to neural networks occurred in the early 1940s, with McCulloch and Pitts.