Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inheritance and indexing #1038

Closed
chris-roerig opened this issue Jan 3, 2017 · 11 comments
Closed

Inheritance and indexing #1038

chris-roerig opened this issue Jan 3, 2017 · 11 comments

Comments

@chris-roerig
Copy link

Hi Pat,

I was wondering if it's possible to index a child model and if so, what configurations and "gotchas" should I look out for?

Example:

Class A < ActiveRecord::Base
end

Class B < A
  self.table_name = "bs"
end


ThinkingSphinx::Index.define :b, with: :active_record, delta: ThinkingSphinx::Deltas::SidekiqDelta do
...
end

I'm attempting this but getting this message:
(notice both class tables are being referenced)

index a_core,a_delta,b_core,b_delta: query error: no field 'sphinx_internal_class_name' found in schema - SELECT * FROM `a_core`, `a_delta`, `b_core`, `b_delta` WHERE MATCH('@sphinx_internal_class_name (B)') AND `sphinx_deleted` = 0 ORDER BY `id` desc LIMIT 0, 10000 OPTION max_matches=10000; SHOW META
@pat
Copy link
Owner

pat commented Jan 4, 2017

Hi Chris,

I've just quickly mocked up an example on my own machine, and I think this should work fine?

Can you let me know which versions of Thinking Sphinx and Rails you're using, so I can double-check my local example. Also: it sounds like you've got an index on the parent model as well?

@chris-roerig
Copy link
Author

chris-roerig commented Jan 4, 2017

Sphinx 2.2.5-id64-release (r4825)
thinking-sphinx (3.1.4)
Rails 4.2.5.2

it sounds like you've got an index on the parent model as well?

Yes that is correct.

Here is an example from my Rails console:

Notice the first query does not produce an error but calling a method on the Search object does.

portal(dev)> B.search
  Sphinx Query (2.9ms)  SELECT * FROM `a_core`, `a_delta`, `b_core`, `b_delta` WHERE MATCH('@sphinx_internal_class_name (B)') AND `sphinx_deleted` = 0 LIMIT 0, 20
=> [#<ThinkingSphinx::Search:0x3fc7fbae36cc>


portal(dev)> B.search.first
  Sphinx Query (2.1ms)  SELECT * FROM `a_core`, `a_delta`, `b_core`, `b_delta` WHERE MATCH('@sphinx_internal_class_name (B)') AND `sphinx_deleted` = 0 LIMIT 0, 20
ThinkingSphinx::QueryError: index a_core,a_delta,b_core,b_delta: query error: no field 'sphinx_internal_class_name' found in schema - SELECT * FROM `a_core`, `a_delta`, `b_core`, `b_delta` WHERE MATCH('@sphinx_internal_class_name (B)') AND `sphinx_deleted` = 0 LIMIT 0, 20; SHOW META
from /Users/user/.rvm/gems/ruby-2.1.6/gems/thinking-sphinx-3.1.4/lib/thinking_sphinx/connection.rb:93:in `rescue in query'

@pat
Copy link
Owner

pat commented Jan 5, 2017

Okay, I've found the cause of the problem. Thinking Sphinx is written to handle single-table inheritance, but there's some inconsistencies when it comes to multi-table inheritance (which is what you're doing).

I've just pushed a fix to the develop branch - 2831ec9 - but given you're on TS v3.1.4, I'd recommend reading the upgrade notes for v3.2.0 and v3.3.0 before switching to the latest (nothing major though!)

gem 'thinking-sphinx', '~> 3.3.0',
  :git    => 'git:/pat/thinking-sphinx.git',
  :branch => 'develop',
  :ref    => '2831ec9bf6'

@chris-roerig
Copy link
Author

Pat, once again you are the man! I can query successfully now but it seems that filtering might not be working.. I will investigate and either close or update this issue. Thanks again

@chris-roerig
Copy link
Author

chris-roerig commented Jan 9, 2017

Pat,
I've upgrading from Thinking Sphinx 3.1.4 to 3.3.0 (the develop branch you noted above) and that has solved my original error. However, it appears filtering now requires a value of the same type of the defined column type in the TS index.

For example,

ThinkingSphinx::SyntaxError at /foos
sphinxql: syntax error, unexpected QUOTED_STRING, expecting CONST_INT or '-' near ''115') AND `sphinx_deleted` = 0 ORDER BY `id` desc LIMIT 0, 20 OPTION max_matches=10000; SHOW META' - SELECT * FROM `foo_core`, `foo_delta`, `foo_template_core`, `foo_template_delta` WHERE MATCH('@sphinx_internal_class_name (FooTemplate)') AND `foo_account_manager_id` IN ('115') AND `sphinx_deleted` = 0 ORDER BY `id` desc LIMIT 0, 20 OPTION max_matches=10000; SHOW META

There error here is because foo_account_manager_id is being passed as a String 115 instead of and integer.

Did search options change internally between 3.1.4 and 3.3.0?

@pat
Copy link
Owner

pat commented Jan 9, 2017

Hi Chris, yes, there has been a change. The issue is two-fold:

  • Sphinx supports filtering on string attributes in recent versions (it didn't previously). A recent update to Riddle/TS has meant that strings are now properly handled for filters, but that definitely breaks things for values that are passed through as strings (say, from params) but are for non-string attributes.
  • Thinking Sphinx doesn't introspect on the configuration when searching to determine types for filtered attributes (it's possible, but it's a lot of extra overhead).

So, the future release will be 3.4.0 due to the change in behaviour… I'll ponder whether there's a neat way to add some kind of deprecation warning in to help people avoid such problems.

@chris-roerig
Copy link
Author

Thanks Pat!

@pat
Copy link
Owner

pat commented Jan 12, 2017

Okay, so I've just pushed 839d480 to the develop branch. This adds a search middleware that checks attribute filters against known names/types, casts values as needed and outputs a deprecation warning. I'm sure it's not perfect, but it should catch most situations.

The configuration parsing is done once per thread, and is from the generated Sphinx file rather than app/indices, so the overhead is quite minimal. Also means that the next release (whether it be 3.3.1 or 3.4.0) can be considered non-breaking :)

@akostadinov
Copy link
Contributor

For single table inheritance, are we supposed to create separate indies for each sub-model or just one for the parent? Because it seems like defining only for the parent does not work. I would like to setup parent and all children to be handled automatically.

@pat
Copy link
Owner

pat commented Jan 25, 2022

Hi @akostadinov - STI models with an index on the parent should work (there are tests for it) - but if you're having problems, can you create a new issue with some details, including the index definition and the searches that you're finding aren't working? Thanks!

@akostadinov
Copy link
Contributor

akostadinov commented Jan 25, 2022

Yes, it works. In fact in the project I'm working on, there is logic to do the updates in background jobs. And figuring out callbacks reference names for child models is tricky. Would be really useful if sphinx can register the callbacks as background jobs as well or make easy to register as such. I will create an issue (see #1215).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants