Can you have too much data for an AI application? In the mad dash to collect the raw material for AI applications, it can be tempting to pull in as much as you can. Product manager Emily Jasper returns to the podcast to share a set of recommendations for more strategic use of data with host Eric Hanselman. Just as it might not be wise to load up on everything at a buffet, being strategic about using the data that best suits the goals of your project can improve outcomes and help to manage risk. By understanding the data that you’re putting to work, you can bound the universe of outcomes and simplify the process of bringing that data into the AI application pipeline. At the same time, data governance becomes clearer when the sources are better understood.
Understanding the set of data resources that an enterprise has is critical, and it has to be accompanied by knowledge of the quality of that data. The principles of library science are back in focus in AI, as organizations work to curate data characteristics and provenance. As in so much of AI, matching the ecosystem of tools, data providers, and capabilities to the use cases being built is fundamental to project success. Managing risk in AI has become a process of bringing the right data to the right problem.