So in AI Agent Builder we can add our own resources, such as documents and OutSystems entities, to ground the LLM. But if we choose ChatGPT, will my data be used outside my application? How can I be sure that my data is not used by the public ChatGPT? This is critical to know before sending any personal or business data into the system.
Hi @Inge van Gemert,
While OpenAI's policies indicate that customer data is not used to train their models, it is crucial to consider the sensitive nature of certain information that may be processed. Customer data containing confidential or strategic information can indeed represent a competitive advantage. Therefore, it is prudent to adopt a cautious approach. Here are some points to consider:
- Risk of Exposure of Sensitive Data: Even though OpenAI has stringent policies, there is always a risk that sensitive data could be exposed or misused. If this data is accessible to third parties, it could compromise the security and privacy of your business.
- Competitive Advantage: Information that could provide a competitive edge, such as business strategies, customer data, or innovations, should be handled with the utmost care. Using external services to process this data may increase the risk of leaks or misuse.
- Internal Infrastructure: For companies dealing with sensitive data, I strongly recommend considering the implementation of an internal infrastructure to support the LLM (Large Language Model). This ensures that data remains within the controlled environment of the company, with no communication to the outside, thereby minimizing the risks associated with data exposure.
- Total Control Over Data: By utilizing an internal solution, you maintain complete control over the data, including how it is stored, processed, and accessed. This not only enhances security but also allows you to implement specific compliance policies that meet your organization's needs.
- Assessment of Needs: Each organization should evaluate its needs and the sensitivity level of the data it handles. For non-critical data, using external services may be acceptable, but for sensitive information, an internal infrastructure is the best practice.
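To make the last point concrete, here is a minimal sketch of routing requests by data sensitivity. It is only an illustration: the `Sensitivity` levels, the `choose_llm` function, and the two endpoint labels are hypothetical names, not part of AI Agent Builder or any OpenAI API, and your own classification scheme may well differ.

```python
from enum import Enum


class Sensitivity(Enum):
    """Hypothetical classification levels; adapt to your own data policy."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3


# Placeholder labels for illustration only, not real product settings.
EXTERNAL_LLM = "external-llm"
INTERNAL_LLM = "internal-llm"


def choose_llm(sensitivity: Sensitivity) -> str:
    """Send only PUBLIC data to the external service; keep the rest in-house."""
    if sensitivity is Sensitivity.PUBLIC:
        return EXTERNAL_LLM
    return INTERNAL_LLM
```

With this rule, `choose_llm(Sensitivity.CONFIDENTIAL)` returns the internal label, so anything not explicitly classified as public stays inside the company's environment by default.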