Winning Createathon Project Uses Large Language Models for Document Tagging

The ninth annual Createathon, FINRA’s premier innovation event, took place over three days in late September. More than 530 staffers participated, including 32 competing teams spanning 11 departments, as well as volunteers, DeepRacer competitors, and a dance troupe.

Participants were given nearly four months of runway to form a team, create a solution to a challenge, and deliver a functional prototype for the first round of judging on day two of the event. Competing teams were provided with four challenges to choose from: Data Drag Reduction System, Leverage Your Toolkit, Tune Your Brand, and Freestyle: The Racing Line.

The objectives of these challenges included making data more accessible for efficient and accurate analytics, increasing team effectiveness, increasing FINRA’s value for a specific target audience, and solving an existing business problem.

Project NITRO: Neural Integrated Taxonomy Regulatory Optimizer won first place in the Data Drag Reduction System challenge and took home the grand prize. The project was designed to “dramatically speed up taxonomy-based document tagging by combining the rapid processing capabilities of large language models (LLMs) with human expertise, reducing the manual effort required to organize FINRA’s regulatory content.”

The winning team included 11 FINRA staffers, bringing skills from Member Supervision, Technology, the Office of Financial Innovation, and the Office of General Counsel. Ryan Lichtenwalter, Director of Technology, and Mikhail Neychev, Lead Developer, participated on the team.

Ryan and Mikhail were approached by the NITRO team lead, who asked them to join the competition. Ryan explains that the team recognized a challenge at FINRA that intersected with a bigger initiative: development of a FINRA machine-readable rulebook.

“That business problem collided with Createathon,” he said, “where business and technology folks can work together and think more innovatively and freely than our daily jobs would typically allow to come up with a solution.”

Mikhail explains that FINRA rules had previously been labeled by hand to make information easier to discover. However, the manual lift was significant and not scalable, and the complexity of the required regulatory taxonomy made the effort harder still.

“We explored an option of doing this automatically,” Mikhail said. “We found that, although state-of-the-art technologies could provide good direction and some results, it wasn't good enough for our specific use case. There are a lot of peculiarities and a lot of contexts that are not currently available on the general datasets that models are trained on.”

Ryan explains the machine-readable rulebook relates to a field called computational law, which combines law, computer science, and linguistics.

“The idea is that human natural language is infinitely interpretable,” Ryan said. “We want laws and regulations to be clear and unambiguous and accessible to people and, in our case, to our member firms.”

He adds that it can be hard to tell when content in the rulebook relates to important regulatory concepts. Language models and natural language technology lack the institutional, experiential, and contextual knowledge needed to take what is written and produce optimal results.

“The more that you can enrich the data with ontologies and taxonomies, the more ambiguity is reduced. There is more expert curation and more of a potential benefit to a variety of downstream consumers,” Ryan said. “That's the idea of the machine-readable rulebook. To take concepts from computational law and apply them in FINRA.”

Since winning the grand prize, team NITRO has received requests from across FINRA to demonstrate the project and to consider a range of use cases.

“The discussions around use cases for NITRO are exciting,” Ryan said. “This type of technology could potentially also be applied in different ways over sets of documents to help us understand what a good taxonomy would be – so there are many directions this could take.”

Both Ryan and Mikhail expressed their appreciation for how Createathon provided them with the opportunity to work closely with FINRA peers outside of their regular scope.

“I really liked working with different people on the team,” Mikhail said. “It felt really different from what you normally see when you're working on a normal project. Createathon brought a special atmosphere and working style. It’s a completely different story when we, business and technology, could work much closer together and outside of regular project schedules.”

Both also expressed gratitude for Createathon itself.

“There is so much value in this type of rapid development and prototyping work, for people to get into the same room and with very different people across FINRA and work together as a team,” Ryan said. “That's a very cool and different dynamic that is, I think, incredibly valuable and one of the great benefits of Createathon to FINRA.”

Following every Createathon, all projects – not only the winning ones – enter into the FINRA Research and Development Program pipeline, where they are tested for feasibility. Learn more about FINRA’s R&D Program here.