In university, my research focused primarily on Natural Language Processing. Specifically, I worked on text generation and query-focused techniques under Professor Laura Dietz as part of the Text Retrieval, Extraction, Machine Learning and Analytics (TREMA) lab.
As an undergraduate, I worked on a project called Impromptune, a machine learning approach to composing classical music. While the project is not directly connected to NLP, many of its techniques and ideas carry over. The paper itself is available here, along with a sample of music generated by my model.
My graduate research shifted from music to text, where I worked on information ordering and summarization. Two main projects came out of this work: one focused on ordering information in a query-specific context, and one on summarizing documents discovered via an Information Retrieval system. Both are discussed further in my thesis, which is also available online.
My work on summarization centers on a multistage approach to abstractive multi-document summarization for article generation. While recent models perform very well on single-document summarization, a document ranking often contains far more text than these models can process at once. Instead, we extract information from each document independently, then recombine those facts into a coherent article via a second pass of summarization. We also limit the number of documents considered at one time by grouping them into subtopical clusters, which has the added benefit of organizing the output article. I think it writes pretty well, but you’re free to judge for yourself.
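The overall shape of that pipeline can be sketched in a few lines. This is only an illustrative toy, not the model from the thesis: the clustering and summarization steps here are deliberately simple stand-ins (keyword-overlap clustering and lead-sentence extraction), and the function names are hypothetical.

```python
# Toy sketch of the multistage pipeline: cluster documents into subtopics,
# summarize each document independently (first pass), then recombine each
# cluster's facts into one section (second pass). All names are illustrative.

def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    cleaned = "".join(c if c.isalnum() else " " for c in text.lower())
    return set(cleaned.split())

def cluster_documents(docs, threshold=0.1):
    """Greedily group documents by Jaccard overlap with each cluster's first doc."""
    clusters = []
    for doc in docs:
        words = tokenize(doc)
        for cluster in clusters:
            rep = tokenize(cluster[0])
            if len(words & rep) / len(words | rep) >= threshold:
                cluster.append(doc)
                break
        else:
            clusters.append([doc])
    return clusters

def summarize(text, n_sentences=1):
    """Stand-in summarizer: keep only the leading sentence(s)."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return ". ".join(sentences[:n_sentences]) + "."

def generate_article(docs):
    """Cluster, summarize each document, then recombine per cluster."""
    sections = []
    for cluster in cluster_documents(docs):
        per_doc = [summarize(d) for d in cluster]          # first pass
        sections.append(summarize(" ".join(per_doc), 2))   # second pass
    return "\n\n".join(sections)
```

Each cluster yields one section of the final article, which is where the organizing effect of the subtopical clustering comes from.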
An interesting result of this approach to article generation is that the provenance of each output statement can be traced back to a source document. This makes the output more interpretable, or at the very least provides a means of verifying the facts in the article.
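One way to picture that traceability, sketched here with hypothetical names rather than the actual system: keep each source document's index attached to every fact extracted in the first pass, so any statement in the output can be mapped back to the document it came from.

```python
# Hedged sketch: carrying source ids through the extraction pass so every
# extracted statement can be traced to its origin. Names are illustrative.

def summarize_with_provenance(docs):
    """Return (sentence, source_index) pairs for the lead sentence of each doc."""
    traced = []
    for doc_id, doc in enumerate(docs):
        lead = doc.split(".")[0].strip()
        if lead:
            traced.append((lead + ".", doc_id))
    return traced
```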
Before I got involved with NLP, I did some work on signal processing and classification, focusing on arrhythmia detection and magnetic wave perturbations (they’re more similar than you’d think!).