If it is a unix socket path we can add different span attributes
opentelemetry-python-contrib/instrumentation/opentelemetry-instrumentation-redis/src/opentelemetry/instrumentation/redis/util.py
Lines 44 to 47 in fbb677a
In this case the `host` attribute will be the path of the socket (which aligns with the way psql takes it as a parameter).
I am open to differentiating it though if `NET_PEER_NAME` is not an accurate representation in that context.
It looks like `NET_PEER_NAME` should be the path to the socket.
If we are setting `NET_PEER_TRANSPORT` anywhere it should be `unix`.
With the values we are setting now I think this should be okay, unless I'm missing a spot where other `net.*` values get set.
Edit: Sorry, misread! I think we are good on `NET_PEER_NAME` but you are right, the transport could be set as well.
Makes sense. Let's add NET_PEER_TRANSPORT conditionally when connecting via unix socket.
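A minimal sketch of what "conditionally" could look like, not the code in this PR. The function name `connection_attributes` is hypothetical, the attribute keys are written as plain strings (`net.peer.name`, `net.peer.port`, `net.transport`) rather than imported from the semantic-conventions package, and the leading-`/` check is an assumption about how a socket host is recognized:

```python
from typing import Dict, Optional, Union

AttributeValue = Union[str, int]


def connection_attributes(
    host: Optional[str], port: Optional[int]
) -> Dict[str, AttributeValue]:
    """Build span attributes, adding the transport only for unix sockets.

    Hypothetical helper for illustration. Attributes whose source value is
    missing are omitted rather than set to a placeholder.
    """
    attributes: Dict[str, AttributeValue] = {}
    if host:
        attributes["net.peer.name"] = host
        # Assumption: a host that is an absolute path denotes a unix socket.
        if host.startswith("/"):
            attributes["net.transport"] = "unix"
    if port is not None:
        attributes["net.peer.port"] = port
    return attributes


if __name__ == "__main__":
    print(connection_attributes("/var/run/postgresql", None))
    print(connection_attributes("db.example.com", 5432))
```

With this shape, TCP connections are left exactly as they are today and only socket connections gain the extra attribute.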
Is absence of `port` guaranteed to be a unix socket? Could it be something else?
@owais that's a great point.
It looks like the `TypeError` could apply to non-socket cases too.
However, there is `cur.connection.info` which provides:
When connecting to a socket it provides:
`psycopg2` makes the assumption that every socket is named `.s.PGSQL.{port}`, meaning it only stores the absolute path of the socket directory. Mirroring that assumption here seems risky.
The `info` object seems like the safer bet since it always provides certain values.
If you initialize a SQLAlchemy engine with `engine = create_engine('postgresql://')` the `.dsn` attribute will be an empty string.
I think at the very least the presence of a filesystem path in `info.host` is a clue we are using sockets... but I'll dig around the `psycopg2` source a bit to see if I can find any better guarantees.
Edit: That being said, it seems like there is a bug present now that people will encounter if they aren't setting an explicit port (i.e. relying on the default). The other function that sets attributes simply omits attributes that aren't present. Worth splitting the work into a bugfix and something that improves the attributes?
Edit 2: Mirroring the `.s.PGSQL.{port}` convention to construct the absolute path seems fine actually, that's a convention enforced by PostgreSQL itself. And since socket support isn't available on Windows, a simple check for `'/'` at the start of the host should be enough to determine whether we are using a socket or not.
Makes sense. Checking if the host starts with `unix://`, `unix:///` or `/` makes more sense to me than checking for absence of port.
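The prefix check and the `.s.PGSQL.{port}` convention discussed above could be sketched roughly as follows. Both function names are hypothetical, the fallback to port 5432 (PostgreSQL's default) is an assumption about how a missing port should be handled, and whether a `unix://` prefix ever reaches this code depends on the driver:

```python
import posixpath
from typing import Optional

DEFAULT_PG_PORT = 5432  # PostgreSQL's default port
# "unix:///" is covered by the "unix://" prefix.
_SOCKET_PREFIXES = ("unix://", "/")


def is_unix_socket_host(host: Optional[str]) -> bool:
    """Return True when the host looks like a unix socket location."""
    return bool(host) and host.startswith(_SOCKET_PREFIXES)


def postgres_socket_path(host: str, port: Optional[int] = None) -> str:
    """Join the socket directory with the `.s.PGSQL.{port}` file name."""
    directory = host[len("unix://"):] if host.startswith("unix://") else host
    return posixpath.join(directory, f".s.PGSQL.{port or DEFAULT_PG_PORT}")


if __name__ == "__main__":
    for candidate in ("/var/run/postgresql", "unix:///tmp", "db.example.com"):
        print(candidate, is_unix_socket_host(candidate))
    print(postgres_socket_path("/var/run/postgresql"))
```

Checking the host rather than the absence of a port keeps the decision independent of whether the caller relied on a default port.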
I pushed up a change that implements the `NET_TRANSPORT` logic for sockets, although I found there is unreachable code.
I wrote a quick test locally.
The `_get_attributes_from_cursor` function is only called if `host` is not in the SQLAlchemy URL.
This is the case for all unix socket connections (for postgres).
On the other hand, `host` is always present when connecting via TCP (it's a requirement, the default if omitted is unix socket).
The code that derives attributes from the URL depends on explicit args, so TCP connections will never get their `net.transport` (and won't get things like `net.peer.port` if using defaults).
We'd have to inspect the cursor for every vendor to get any information not inside the URL.
In the commit I just pushed up we are doing that for postgresql, but I feel like with this much divergent logic the test suite would basically need access to every supported SQLAlchemy engine.
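To make the reachability problem concrete, here is a simplified model of the flow described above. It is not the instrumentation's actual code: all three function names are hypothetical stand-ins, and the cursor is faked with `SimpleNamespace` objects exposing only `connection.info.host` and `connection.info.port`:

```python
from types import SimpleNamespace


def attributes_from_url(host=None, port=None):
    """Derive attributes from explicit URL parts, omitting missing ones."""
    attrs = {}
    if host:
        attrs["net.peer.name"] = host
    if port:
        attrs["net.peer.port"] = port
    return attrs


def attributes_from_cursor(cursor):
    """Derive attributes from the driver's view of the connection."""
    info = getattr(getattr(cursor, "connection", None), "info", None)
    attrs = {}
    host = getattr(info, "host", None)
    port = getattr(info, "port", None)
    if host:
        attrs["net.peer.name"] = host
    if port:
        attrs["net.peer.port"] = port
    return attrs


def collect_attributes(url_host, url_port, cursor):
    """Consult the cursor only when the URL carried no host."""
    attrs = attributes_from_url(url_host, url_port)
    if "net.peer.name" not in attrs:
        attrs = attributes_from_cursor(cursor)
    return attrs


if __name__ == "__main__":
    tcp_cursor = SimpleNamespace(
        connection=SimpleNamespace(
            info=SimpleNamespace(host="db.example.com", port=5432)
        )
    )
    # The URL has a host but no explicit port, so the cursor is never read
    # and the default port known to the driver does not reach the span.
    print(collect_attributes("db.example.com", None, tcp_cursor))
```

In this model the cursor branch runs only for host-less URLs, which matches the observation that TCP connections relying on defaults lose `net.peer.port`.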
Revisiting the original scope of this PR:
There is a bug when using sockets for postgres specifically because of unreachable (in tests) vendor-specific code.
I think the initial commit b03b44f addresses the bug without losing anything useful or introducing anything dangerous. Maybe enhancing the attribute recognition is better suited for a bigger PR?