[CAI-118] Presidio #1183
🦋 Changeset detected
Latest commit: 028b827
The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
apps/chatbot/src/modules/models.py
@@ -22,7 +22,7 @@
 assert PROVIDER in ["aws", "google"]

-GOOGLE_PARAM_NAME = os.getenv("CHB_GOOGLE_API_KEY")
+GOOGLE_PARAM_NAME = os.getenv("GOOGLE_PARAM_NAME")
The environment variable was previously named CHB_GOOGLE_API_KEY to comply with the naming standard we use for the others.
-GOOGLE_PARAM_NAME = os.getenv("GOOGLE_PARAM_NAME")
+GOOGLE_PARAM_NAME = os.getenv("CHB_GOOGLE_API_KEY")
ok. Anyway, we need a better way to share the env variables
we have the .env.example file
I mean a shared env file with all the filled variables
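One way to make missing or misnamed variables surface early, sketched under assumptions: the helper name `load_settings` and the idea of listing required names in one place are hypothetical and not part of this PR. Only the `CHB_` prefix comes from the thread above.

```python
import os

# Prefix taken from CHB_GOOGLE_API_KEY as discussed in this thread.
ENV_PREFIX = "CHB_"


def load_settings(names, environ=None):
    """Read CHB_-prefixed variables; report every missing one in a single error.

    Hypothetical helper for illustration, not code from the PR.
    """
    environ = os.environ if environ is None else environ
    values, missing = {}, []
    for name in names:
        key = ENV_PREFIX + name
        if environ.get(key):
            values[name] = environ[key]
        else:
            # Unset and empty variables are both treated as missing.
            missing.append(key)
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return values
```

Calling `load_settings(["GOOGLE_API_KEY"])` at startup would then fail with the full variable name if the local env file is out of sync with `.env.example`.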
apps/chatbot/src/modules/presidio.py
self.nlp_engine = nlp_engine
self.analyzer = AnalyzerEngine(
    nlp_engine = self.nlp_engine,
    supported_languages = ["it", "en", "es", "fr", "de"],
Is it actually necessary to support languages other than Italian?
Yes. There is no longer a language block, so a user can write in any language. Presidio masks PII entities only if the language is one of its supported inputs (see the detect_pii method). So I assumed that a typical PagoPA user would write in one of the main European languages.
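The language selection described here can be sketched roughly as follows. This is not the PR's detect_pii implementation: `Candidate` is a stand-in for the objects returned by langdetect's `detect_langs` (which expose `.lang` and `.prob`), and both `pick_language` and the fallback to `"it"` are assumptions made for illustration.

```python
from typing import NamedTuple

# Same language list as in the diff under review.
SUPPORTED_LANGUAGES = ["it", "en", "es", "fr", "de"]


class Candidate(NamedTuple):
    """Stand-in for a langdetect result: language code plus probability."""
    lang: str
    prob: float


def pick_language(candidates, supported=SUPPORTED_LANGUAGES, default="it"):
    """Return the most probable supported language, or `default` if none match.

    The default of "it" is an assumption, not confirmed by the PR.
    """
    allowed = [c for c in candidates if c.lang in supported]
    if not allowed:
        return default
    return max(allowed, key=lambda c: c.prob).lang
```

Building a filtered list avoids removing items from `lang_list` while iterating over it, which is what the reverse-index loop in the diff appears to be working around.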
apps/chatbot/src/modules/presidio.py
try:
    lang_list = detect_langs(text)
    for i in range(len(lang_list)-1, -1, -1):
        if lang_list[i].lang not in ["it", "en", "es", "fr", "de"]:
ditto
see previous comment
Branch is not up to date with base branch
@christian-calabrese it seems this Pull Request is not up to date with the base branch.
LGTM
Jira Pull Request Link
This Pull Request refers to the following Jira issue: CAI-118
List of Changes
Add Presidio to the chatbot module to mask personally identifiable information (PII) entities.
Motivation and Context
This lets us store a user's conversation with all PII in it masked, for privacy reasons.
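The mask-before-store flow can be illustrated with a minimal sketch. It deliberately swaps Presidio's recognizers for two plain regular expressions so that it runs with the standard library alone. The patterns, the `<LABEL>` placeholder format, and the names `mask_pii` and `store_message` are all assumptions for illustration, not the PR's code.

```python
import re

# Simplified stand-ins for Presidio recognizers; real detection is far broader.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE_NUMBER": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def mask_pii(text):
    """Replace each matched span with a placeholder naming the entity type."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text


def store_message(history, text):
    """Append only the masked text, so raw PII never reaches storage."""
    history.append(mask_pii(text))
```

The point of the design is that masking happens at the single place where messages are persisted, so no caller can store the raw text by accident.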
How Has This Been Tested?
Jupyter notebook