Earlier this year, Databricks launched Dolly 2.0: the world's first truly open instruction-tuned Large Language Model (LLM). To build on the excitement around LLMs, Databricks hosted two hackathons, one virtual and one in person, to kick off the annual Data + AI Summit. Data teams from around the globe came together to build unique use cases, applications, and techniques showcasing LLMs. Between the two events, we had over 1,000 participants and over 70 submitted projects!
Let's take a moment to showcase the winning projects and highlight the fantastic teams who created them.
Virtual Hackathon Winners
Our virtual hackathon was held prior to Data + AI Summit in collaboration with the Devpost team (special shout-out to Michelle Brain for helping make this such a huge success). Participants had 30 days to submit their projects and could work in a team of up to four people. Check out the full project gallery here.
First place: DataDM, your open source private data assistant.
Gus Eggert, Mike Biven, Justin Waugh, Kevin Huo
DataDM is a chatbot interface where users talk to an AI assistant that writes code, which is then executed to answer data questions. Users can ask for help with data processing, feature engineering, data cleaning, question answering, visualizations, and even some data science modeling. They can bring their own CSVs (which stay local and private) or find CSVs via the GitHub API; either way, adding a file takes a single click.
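The "assistant writes code, which is then executed" pattern can be illustrated with a minimal sketch. This is not DataDM's implementation: `stub_model`, `load_rows`, `answer`, and the sample CSV are all hypothetical names and data made up for illustration, and the model call is replaced by a hard-coded string so the example runs offline.

```python
import csv
import io

# Hypothetical sample data standing in for a user-supplied CSV.
SAMPLE_CSV = "city,sales\nOslo,10\nLima,25\nPune,15\n"


def load_rows(csv_text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def stub_model(question):
    """Stand-in for an LLM call (assumption: returns Python source as text).

    A real assistant would send the question and a summary of the data to a
    model. Here the "generated" code is hard-coded.
    """
    return "result = sum(int(row['sales']) for row in rows)"


def answer(question, csv_text):
    """Get code for the question, execute it, and return its `result`."""
    code = stub_model(question)
    namespace = {"rows": load_rows(csv_text)}
    # exec runs arbitrary code; a real system would need proper sandboxing.
    exec(code, namespace)
    return namespace["result"]


total = answer("What are the total sales?", SAMPLE_CSV)
print(total)
```

The generated code only sees the variables placed in `namespace`, which is one simple way to hand the loaded data to model-written code and read the answer back.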
Second place: Cancer-Rx-Approve, write letters for cancer drug denials.
Mitchell built Cancer-Rx-Approve, a cloud-based web application that clinicians can use to generate letters to send to insurance companies when requests for chemotherapy are denied. The LLM synthesizes information from the patient's medical records and recent drug research to create a compelling letter. These letters would both save clinicians valuable time and reduce the time patients need to wait to receive their life-saving cancer care.
Third place: Fight Health Insurance, Automatically generated health insurance appeals.
Holden developed Fight Health Insurance, which generates health insurance appeals to lower the barrier to entry for individuals who want to appeal insurance claim denials. Fight Health Insurance combines the output of multiple models (dolly-v2-12b, BioGPT, etc.) with California's public insurance data to generate a (low-quality) synthetic dataset, which is then used to fine-tune Falcon 7B to generate health insurance appeals.
Honorable mentions
Data + AI Summit 2023 Hackathon Winners
To kick off the in-person event, Data + AI Summit 2023 attendees joined in on the fun at an all-day hackathon centered around LLMs. Participants had six hours to submit their projects and could work in a team of up to four people. Shout-out to MLH and the team, Alba King, Fiona Whittington, and Paul Horton, for making sure the attendees had a fun and meaningful experience.
First place: SchemaSpy, Navigating Database Schemas with AI Assistance.
by Sai Pradeep Peri, Cameron Hutchison, Khushboo Breja, and Rik Bauwens
Second place: Pharma Assist, Helping people with conditions that require medication create, maintain and stay on schedule for taking their medication.
by Joshua Bowron, Richard Ryman, Himanshu Grover, and Max Mogenis
Third place: Commit-With-A-Bang, an AI-Powered Pull Request (PR) engine for effortless and insightful code management.
by Dennis Aguillon, Patrick Young, Priya Prakshal, and Prakshal Jain
Huge congratulations to these creators! Thank you for inspiring us to keep building and growing together, and for showcasing the powerful impact LLMs can have across use cases.
Want to learn more about using and building your own Large Language Models? Check out Databricks' expert-led EdEx courses!