Using umlauts in the CSR subject breaks idempotency #270

Closed
simontunnat opened this issue Sep 10, 2021 · 8 comments · Fixed by #271
Labels
bug Something isn't working

Comments

@simontunnat

simontunnat commented Sep 10, 2021

I have issues using umlauts in a CSR.

Example:

- name: "Create certificate signing request"
  community.crypto.openssl_csr:
    country_name: "DE"
    state_or_province_name: "Baden-Württemberg"
    ...

This leads to the following output when running the play for the FIRST time:

TASK [Create certificate signing request] ******************************************************
--- before
+++ after
...
+    "subject": {
+        "countryName": "DE",
+        "stateOrProvinceName": "Baden-W\u00fcrttemberg"

It leads to the following output when running the play for the SECOND or THIRD time:

TASK [Create certificate signing request] ******************************************************
changed: [somehost]

After this, the CSR file has the same content, but its last-modified timestamp has changed.

This issue seems to be encoding-related. When I encode the string "Baden-Württemberg" as "Baden-W\xC3\xBCrttemberg" in my play, it works as intended.
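
For reference, the escaped form above is simply the UTF-8 byte sequence of the original string. A minimal Python 3 sketch (the variable names are illustrative only, not taken from the module):

```python
# The value from the play, written with an escape so the code point is explicit.
state = "Baden-W\u00fcrttemberg"  # "ü" is the single code point U+00FC

# UTF-8 encodes U+00FC as the two bytes 0xC3 0xBC.
encoded = state.encode("utf-8")
print(encoded)  # b'Baden-W\xc3\xbcrttemberg'

# Decoding the bytes gives back the original text.
assert encoded.decode("utf-8") == state
```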

@Ajpantuso
Collaborator

Hmm, this seems like a broader issue with the current approach to comparing Subject attributes.
Currently, a fairly simple text comparison is performed: the (name, attribute) pairs are collected as tuples and compared as sets.
However, the actual rules for comparison are more complex. From what I saw in PyOpenSSL and Cryptography, initializing Name attributes from the module parameters does not improve the situation, as those libraries also perform text comparisons on values.
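
A rough sketch of what such a set-of-tuples comparison looks like, and where it is strict. This is not the module's actual code; `as_pairs` and the sample subjects are hypothetical:

```python
def as_pairs(subject):
    """Turn a list of (name, value) entries into a set of tuples (hypothetical helper)."""
    return set((name, value) for name, value in subject)

existing = [("countryName", "DE"), ("stateOrProvinceName", "Baden-W\u00fcrttemberg")]
# Same entries, different order.
reordered = [("stateOrProvinceName", "Baden-W\u00fcrttemberg"), ("countryName", "DE")]
# Same visible text, but "ü" spelled as "u" + combining diaeresis (U+0308).
respelled = [("countryName", "DE"), ("stateOrProvinceName", "Baden-Wu\u0308rttemberg")]

same_when_reordered = as_pairs(existing) == as_pairs(reordered)
same_when_respelled = as_pairs(existing) == as_pairs(respelled)
print(same_when_reordered, same_when_respelled)  # True False
```

The set comparison ignores ordering but treats any difference in code points as a different value.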

In this particular case the issue can be addressed by applying normalize('NFC', <value>) from unicodedata on the UTF8String fields before comparing to the existing encoded values.

From RFC 5280

When the UTF8String encoding is used, all character sequences SHOULD be normalized according to Unicode normalization form C (NFC) [NFC].
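
As a small illustration of the NFC suggestion, using only the standard library's `unicodedata.normalize` (`nfc_equal` is a hypothetical helper, not something the module provides):

```python
from unicodedata import normalize

def nfc_equal(a, b):
    """Compare two text strings after normalizing both to NFC (hypothetical helper)."""
    return normalize("NFC", a) == normalize("NFC", b)

composed = "Baden-W\u00fcrttemberg"     # "ü" as one code point, U+00FC
decomposed = "Baden-Wu\u0308rttemberg"  # "u" followed by combining diaeresis U+0308

print(composed == decomposed)           # False
print(nfc_equal(composed, decomposed))  # True
```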

However, I think there are more transformations that should be applied for comparison.
@felixfontein thoughts?

@felixfontein
Contributor

I was only able to replicate this with the cryptography backend and Python 2.7. For PyOpenSSL or Python 3, this doesn't seem to be a problem. The main issue here is that unicode strings are compared with byte strings in this specific case. In Python 3, all strings involved are unicode strings, and the PyOpenSSL code converts everything to byte strings before comparing (at least for the subject).
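
To illustrate the text-versus-bytes mismatch described here, a minimal sketch written for Python 3. It is not the code from the actual fix, and `to_text` below is a hypothetical stand-in helper:

```python
def to_text(value, encoding="utf-8"):
    """Decode byte strings to text; return text unchanged (hypothetical helper)."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

from_play = "Baden-W\u00fcrttemberg"  # text string, as provided by the user
from_csr = from_play.encode("utf-8")  # byte string, as a backend might return it

# A text string never compares equal to a byte string in Python 3.
print(from_play == from_csr)                    # False
# Converting both sides to the same type first makes the comparison meaningful.
print(to_text(from_play) == to_text(from_csr))  # True
```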

@Ajpantuso I don't think normalization is the problem here, and in fact one can argue that it shouldn't be, since the module's job is not necessarily to do normalization, but to put into the CSR what the user provides. (Though I guess one could provide one or more options to configure this behavior, both for processing input and for normalizing for comparisons.)

@Ajpantuso
Collaborator

Ah, that makes sense. My concern was that Unicode was being normalized by OpenSSL, so user input would lose idempotency over multiple runs, in which case normalization during comparison would help. Not sure if the trick of filtering the option value through to_json in the task definition would help here.

@felixfontein
Contributor

I would expect OpenSSL to simply ignore this issue (especially when used as a library). If OpenSSL did normalization itself that's not required by the RFCs, this would make it impossible to do certain (valid) things with it. Also, as you mentioned, Unicode is complicated, so trying to handle normalization on such a low level is a recipe for disaster ;-) (Same for PyOpenSSL and cryptography, they should rather avoid that matter and leave it to the application level.)

In any case, I don't think to_json will do any normalization. It should only handle encoding, but not normalization. I guess it would be useful to have some unicode normalization filters for Ansible, though. But they would rather belong in community.general (or even core) than here :-)
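
A quick way to see the difference between encoding and normalization, using Python's standard `json` module as a stand-in (this is an assumption; Ansible's to_json filter is not called here):

```python
import json
from unicodedata import normalize

composed = "Baden-W\u00fcrttemberg"      # "ü" as U+00FC
decomposed = normalize("NFD", composed)  # "u" followed by U+0308

# JSON encoding escapes non-ASCII code points but keeps them as they are.
print(json.dumps(composed))    # "Baden-W\u00fcrttemberg"
print(json.dumps(decomposed))  # "Baden-Wu\u0308rttemberg"

# A round trip returns exactly the input, so the two forms stay different.
assert json.loads(json.dumps(decomposed)) == decomposed
assert json.loads(json.dumps(decomposed)) != composed
```

The escaped form `\u00fc` is also what appears in the diff output at the top of this issue.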

@felixfontein
Contributor

I created #271 to fix this issue. (It also comes with some tests :) )

@simontunnat
Author

simontunnat commented Sep 10, 2021

Thank you very much for the fast feedback and already creating a PR.

I will be able to check if this works for my use case on Monday. :)

@simontunnat
Author

The PR works as intended. :)

When might this change be released through Ansible Galaxy?

@felixfontein
Contributor

felixfontein commented Sep 14, 2021

I'll create a 1.9.3 release later today.
