04/13/2023
We are serious about BBQ at SAIGE. A home party at Minje’s place.
SAIGE is a group of researchers working on machine learning algorithms for signal processing problems.
03/10/2023
Two additional presentations! Our recent journal papers on "self-supervised learning for personalized speech enhancement" (IEEE JSTSP, led by Aswin Sivaraman) and "AdaBoost-based hashing for efficient speech enhancement" (IEEE TASLP, led by Dr. Sunwoo Kim) were accepted for presentation.
02/17/2023
SAIGE members (Anastasia Kuznetsova, Haici Yang, Darius Petermann, Aswin Sivaraman, and Minje Kim) authored four papers accepted for publication at ICASSP 2023. Kudos to the authors! See you all in Rhodes Island, Greece!
11/07/2022
SAIGE welcomes new members with warm food and shirts. The photo was taken in August, and they are already well into their research projects.
08/24/2022
We care a lot about making AI models efficient enough to run on devices. Our latest effort in this area was published in IEEE Trans. on ASLP, where we proposed a hashing-based method for speech enhancement/source separation. Our algorithm learns efficient and effective binary representations that are used to perform speech denoising in a bitwise, i.e., hardware-friendly, fashion. The paper also provides a comprehensive view of our method from the perspective of kernel methods, along with an interpretation as a neural network model. Please check out our paper:
Demo and source code: https://saige.sice.indiana.edu/research-projects/bwss-blsh/
IEEE Xplore: https://ieeexplore.ieee.org/document/9053052
Author version (pdf): https://saige.sice.indiana.edu/wp-content/uploads/taslp2022_skim.pdf
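For readers wondering why binary codes count as "hardware-friendly," here is a minimal sketch. It is not the algorithm from the paper: random sign projections stand in for the learned hash functions, the nearest-neighbor mask averaging is only an assumed way to use the codes for denoising, and every function name below is hypothetical. What it does show is that comparing two hash codes needs nothing more than an XOR and a bit count.

```python
import numpy as np

def hash_codes(frames, projections):
    """Hash each frame to packed bits (ASSUMPTION: random sign projections,
    used here as a stand-in for a learned hash function)."""
    bits = (frames @ projections) > 0      # (n_frames, n_bits) booleans
    return np.packbits(bits, axis=1)       # (n_frames, n_bits // 8) uint8

# Lookup table: number of set bits in every possible byte value.
POPCOUNT = np.array([bin(v).count("1") for v in range(256)], dtype=np.uint8)

def hamming_distances(query_code, dictionary_codes):
    """Hamming distance from one packed code to every dictionary code,
    computed with XOR followed by a per-byte bit count."""
    return POPCOUNT[np.bitwise_xor(dictionary_codes, query_code)].sum(axis=1)

def estimate_mask(query_code, dictionary_codes, dictionary_masks, k=5):
    """Average the masks of the k dictionary frames whose codes are closest
    (an illustrative assumption, not necessarily the paper's procedure)."""
    distances = hamming_distances(query_code, dictionary_codes)
    nearest = np.argsort(distances, kind="stable")[:k]
    return dictionary_masks[nearest].mean(axis=0)

# Toy data only: random "spectra" and random binary masks.
rng = np.random.default_rng(0)
n_features, n_bits = 64, 128
projections = rng.standard_normal((n_features, n_bits))
dictionary = rng.standard_normal((1000, n_features))
dictionary_masks = rng.integers(0, 2, size=(1000, n_features))

codes = hash_codes(dictionary, projections)
noisy_frame = rng.standard_normal((1, n_features))
mask = estimate_mask(hash_codes(noisy_frame, projections)[0], codes, dictionary_masks)
```

The resulting `mask` holds values between 0 and 1 that could be multiplied with a noisy spectrum. The search itself touches only packed bytes, which is where the efficiency comes from.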
05/04/2022
Please check out Darius Petermann's cool presentation on SpaIn-Net, a music source separation model that is mindful of the instruments' spatial locations. SpaIn-Net is robust even if the spatial information is not precise. ;) https://iu.mediaspace.kaltura.com/media/t/1_mboimmw7
04/25/2022
At SAIGE, we are dead serious about source separation, but in one of our new papers we suggest you "don't separate, learn to remix" if all you want is a remix. We propose a novel interactive "remixing" system: an end-to-end model with joint optimization for separation and remixing. Yes, the point is not to focus too much on source separation unless it's absolutely necessary. Please take a look at Haici Yang's virtual presentation for more information. This was the result of an exciting collaboration with Nick Bryan at Adobe Research.
02/18/2022
SpaIn-Net is a spatially-informed network for music source separation. It takes the user's rough guess about the stereophonic location of a musical instrument as input and uses it to separate that instrument better. More details, a demo, source code, and our paper about the SpaIn-Net project are here: https://saige.sice.indiana.edu/research-projects/spain-net/
02/15/2022
We named one of our new deep learning models after our beloved hometown, Bloomington, IN! In our paper, we present "BLOOM-Net," which flexibly scales its architecture to fit devices from small to large while always retaining optimal speech enhancement performance. More details on this open-sourced project can be found here:
BLOOM-Net: Scalability Matters – SAIGE@IU
Scalability is a big deal when it comes to video coding. When you watch a movie via a streaming service on Friday night, the video quality fluctuates: it’s the video codec’s effort in providing the maximum video quality even though your internet connection suffers...
Our T-ASLP paper is officially published (led by Kai Zhen)! It's a comprehensive consolidation of our neural speech coding projects with lots of in-depth insight and new discoveries. Check out our paper if you are curious how deep learning can be used for speech/audio coding.
IEEE Xplore: https://ieeexplore.ieee.org/document/9622124
Authors' PDF: https://saige.sice.indiana.edu/wp-content/uploads/taslp2022_kzhen.pdf
SAIGE members authored SEVEN papers accepted for publication at ICASSP 2022. Great teamwork among the SAIGE members, as well as exciting external collaborations! We all sincerely hope to be there in Singapore in person this year!
Sunwoo Kim, Minje Kim, "BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable And Efficient Speech Enhancement"
Darius Petermann, Minje Kim, "SpaIn-Net: Spatially-Informed Stereophonic Music Source Separation"
Haici Yang, Shivani Firodiya, Nicholas Bryan, Minje Kim, "Don't Separate, Learn to Remix: End-to-End Neural Remixing with Joint Optimization"
Haici Yang, Sanna Wager, Spencer Russell, Mike Luo, Minje Kim, Wontak Kim, "Upmixing via Style Transfer: a Variational Autoencoder for Disentangling Spatial Images and Musical Content"
Hao Zhang, Srivatsan Kandadai, Harsha Rao, Minje Kim, Tarun Pruthi, Trausti Kristjansson, "Deep Adaptive AEC: Hybrid of Deep Learning and Adaptive Acoustic Echo Cancellation"
Aswin Sivaraman, Scott Wisdom, Hakan Erdogan, John R. Hershey, "Adapting Speech Separation to Real-World Meetings Using Mixture Invariant Training"
Darius Petermann, Gordon Wichern, Jonathan Le Roux, Zhong-Qiu Wang, "The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks"
11/15/2021
A late fall excursion to Yellowwood State Forest.