Term too long issue #77

Closed
gpetukhov opened this issue May 22, 2011 · 3 comments

Comments

@gpetukhov

When I run manage.py rebuild_index, I get the following error:

  File "./manage.py", line 11, in <module>
    execute_manager(settings)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/django/core/management/__init__.py", line 438, in execute_manager
    utility.execute()
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/django/core/management/__init__.py", line 379, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/django/core/management/base.py", line 191, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/django/core/management/base.py", line 220, in execute
    output = self.handle(*args, **options)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/haystack/management/commands/rebuild_index.py", line 14, in handle
    call_command('update_index', **options)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/django/core/management/__init__.py", line 166, in call_command
    return klass.execute(*args, **defaults)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/django/core/management/base.py", line 220, in execute
    output = self.handle(*args, **options)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/haystack/management/commands/update_index.py", line 184, in handle
    return super(Command, self).handle(*apps, **options)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/django/core/management/base.py", line 286, in handle
    app_output = self.handle_app(app, **options)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/haystack/management/commands/update_index.py", line 218, in handle_app
    do_update(index, qs, start, end, total, self.verbosity)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/haystack/management/commands/update_index.py", line 100, in do_update
    index.backend.update(index, current_qs)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/xapian_backend.py", line 257, in update
    database.replace_document(document_id, document)
xapian.InvalidArgumentError: Term too long (> 245):  4f6d6e6961206d6561206d6563756d20706f72746f202d20e2f1e520f1e2eee520edeef8f320f120f1eee1eefe0d0a566974612073696e65206c69626572746174652c206e6968696c202d20e6e8e7edfc20e1e5e720f1e2eee1eee4fb202d20ede8f7f2ee0d0a417273206c6f6e67612c207669746120627265766973

Do you have any idea how to handle this problem?

Thanks.

P.S.
python 2.6
python-xapian (debian) 1.2.4-1
libxapian (debian) 1.2.5
recent versions of django, haystack-xapian, django-xapian

@gpetukhov
Author

I've created a quick fix. It simply ignores the error; I think documents with invalid terms are then skipped and not added to the index at all.

diff --git a/xapian_backend.py b/xapian_backend.py
index fbbe221..1884613 100755
--- a/xapian_backend.py
+++ b/xapian_backend.py
@@ -259,7 +259,10 @@ class SearchBackend(BaseSearchBackend):
                     DOCUMENT_CT_TERM_PREFIX + u'%s.%s' %
                     (obj._meta.app_label, obj._meta.module_name)
                 )
-                database.replace_document(document_id, document)
+                try:
+                    database.replace_document(document_id, document)
+                except xapian.InvalidArgumentError, ex:
+                    sys.stderr.write('xapian.InvalidArgumentError\n')
 
         except UnicodeDecodeError:
             sys.stderr.write('Chunk failed.\n')

@notanumber
Owner

This is actually raised by Xapian itself. I've left the exception to bubble up so it's possible for a developer to see the issue. The solution is to ensure your terms are no longer than 245 characters, unfortunately.
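One way to keep terms under the limit is to shorten them before they reach Xapian. The following is only a minimal sketch, not part of xapian-haystack: `truncate_term` and `MAX_TERM_BYTES` are hypothetical names, and it assumes the limit applies to the UTF-8 encoded length of the whole term, prefix included. Whether truncating is acceptable depends on the field, since a truncated term no longer matches an exact lookup of the full value.

```python
MAX_TERM_BYTES = 245  # limit quoted in the Xapian error message in this issue


def truncate_term(term, prefix="", max_bytes=MAX_TERM_BYTES):
    """Return prefix + term, shortened so its UTF-8 encoding fits in max_bytes.

    Hypothetical helper: assumes the prefix counts toward the limit.
    """
    budget = max_bytes - len(prefix.encode("utf-8"))
    if budget < 0:
        raise ValueError("prefix alone exceeds the term length limit")
    # Slice on bytes, since the limit is assumed to be a byte count.
    encoded = term.encode("utf-8")[:budget]
    # The slice may cut a multi-byte character in half; drop the partial tail.
    return prefix + encoded.decode("utf-8", errors="ignore")
```

Slicing the encoded bytes rather than the string keeps non-ASCII text (such as the Cyrillic text in the failing document) within the byte budget without leaving a broken character at the end.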

Alir3z4 added a commit to Alir3z4/xapian-haystack that referenced this issue Nov 10, 2015
Fixes notanumber#77

Not handling this exception breaks the [re]building the index.
@Alir3z4

Alir3z4 commented Nov 10, 2015

@notanumber @jorgecarleitao What do you think about Alir3z4@a249b46?

A text document is usually longer than 245 characters. I know the error is raised by Xapian itself, but it shouldn't break [re]building the index.
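A middle ground between letting the exception abort the run and silently ignoring it is to skip the failing document but record which ones were skipped. This is a rough sketch under stated assumptions, not the code from the referenced commit: `index_all` and its parameters are hypothetical. In the real backend, `replace_document` would be `database.replace_document` and `error_types` would be `(xapian.InvalidArgumentError,)`; the default here is `ValueError` only so the sketch runs without Xapian installed.

```python
import sys


def index_all(documents, replace_document, error_types=(ValueError,), log=sys.stderr):
    """Index every (doc_id, document) pair, skipping those that raise error_types.

    Returns the ids of the skipped documents so failures remain visible.
    """
    skipped = []
    for doc_id, document in documents:
        try:
            replace_document(doc_id, document)
        except error_types as exc:
            # Report the failure and carry on with the remaining documents.
            log.write("Skipping %s: %s\n" % (doc_id, exc))
            skipped.append(doc_id)
    return skipped
```

Returning the skipped ids lets the caller decide what to do at the end of the run, for example print a summary or exit with a non-zero status, so the problem is still visible to the developer.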
