Relationship Graph
View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0002921 | mantisbt | sql | public | 2003-02-10 10:00 | 2008-08-12 09:36 |
| Reporter | 111 | Assigned To | grangeway | ||
| Priority | normal | Severity | major | Reproducibility | always |
| Status | closed | Resolution | unable to reproduce | ||
| Platform | x86 | OS | Windows 2000 | ||
| Product Version | 0.17.5 | ||||
| Summary | 0002921: search gets slow as the number of bugs/bugnotes grows, i re-write the simple text search code | ||||
| Description | I have over 5000 rows in bugnote_text/bugnote tables, and over 500 open cases. the PHP script (view_all_bug_page.php) times out on me after 30 seconds if a string is specified in the "search" textbox | ||||
| Steps To Reproduce | simply do a search on this demo site and seem how long it takes. | ||||
| Additional Information | I re-rewrite the "Simple Text Search" code and it now takes only 1 or 2 seconds to search against the same amount of data. | ||||
| Tags | No tags attached. | ||||
|
i am the one opened this case. |
|
|
Could you describe the changes you made? This code will likely no longer work directly in the CVS codebase so someone will need to apply the general idea of the change. What was your solution to the problem? |
|
|
the idea is to break one JOIN that involves 4 tables into 2 JOINs that each involves 2 tables. This uses much less resource when many bugs have lots of bugnotes. I believe use the new mysql command "EXPLAIN SELECT" in mysql v4 should help explain why. I added comments to the code below. --start-- #search in the bug fields for $f_search $result3_1 = db_query( $query3_1 ); #concatenate bug ids, delimited by "," #concatenate bug ids, delimited by "," if($t_bug_ids){ $t_columns_clause = " $g_mantis_bug_table.*"; |
|
|
It seems to me like the joins are of no real benefit anyway. Wouldn't it be even quicker to do 3 separate selects on each individual table (bugnote_table, bug_table, bug_text_table). In addition, I also thing we should weed out duplicate bug IDs manually instead of using the SQL "IN" function. I would test this and benchmark it, but unfortunately my Mantis installation is tiny with not many bugs or bugnotes, so benchmarks wouldn't be all to conclusive. |
|
|
My understanding is that it's the IN clause isn't removing duplicates, it's making sure we only select data for the bugs we're interested in. I hate that we're sending a huge string as the query but I'm not sure there's a better solution here is there? We need to get the data on the correct bugs, sorted by the correct field and in a defined order so when we go page by page it works right. I've been doing the sorting in PHP code in a lot of cases but since some of this includes this could be problematic. Actually I guess we could pull out the date as a number of seconds since the epoch and could sort on that. Anyway, I'm blathering... I haven't looked closely at this code yet. |
|
|
I've managed to get the new BOOLEAN text search working using MySQL 4 - seems quick enough with several thousand bugs and is MUCH more useful for finding stuff. Of course the stuff above about splitting the search into seperate queries may still speed things up again. Replace: array_push( $t_where_clauses, With: array_push($t_where_clauses, "MATCH (summary, $t_bug_text_table.description, steps_to_reproduce, additional_information, $t_bugnote_text_table.note) AGAINST ('".addslashes($c_search)."' IN BOOLEAN MODE)"); (I don't unclude the bug id in the search as we use a per-project numbering system with a project-specific prefix. The jump facility goes to a given bug now anyway.) I would be interested if anyone who was suffering with slow searches could try this and see how fast it is? |
|
|
I have put in a not about this some time ago in somewhere, but wouldn't it make sense to take all of 'words' in the bug notes, descriptions and other text fields and directly index them in a table? You can get very fancy with this... That would enable something that I've been very interested in, the ability to do 'real' searches. I'd like to be able to type in the query text -> 'speed motor noise' and have it pull up the bugs that have each of these terms in them (as opposed to the bugs that have the string 'speed motor noise'. This gets really important as the size of the database increases. |
|
|
anyway, here is my code again, in case someone also found the search un-acceptably slow, now should be in filter_api.php. |
|
|
hucker, I think what you're describing is the MySQL full-text search. It creates a word index and allows various boolean searches, like Google. We now have full-text enabled for words of 2 characters and up, and it works extremely well. There are draw-backs to the full-text search as you can't search for "non-word" strings such as ASX-45-thing, so we have a toggle for "Exact match" versus "Words (boolean)" options when searching. |
|
|
Thanks BZheng! Works like a charm. |
|
|
Do the 0.19 alpha's still have the issue discussed here? |
|
|
i've added a related issue to 0004345 which may be similar to this. In bug 0004348, there is a patch to support displaying query times for each db_query command. If anyone is having any slow queries, if you could apply that patch and attach a copy of your query list here ? |
|
|
Thank you for taking the time to report a problem with mantis. A number of optimisations have been applied to the filter code in 1.0, 1.1 and even the upcoming 1.2. Could you let us know with a new bug report if there are issues unaddressed by these changes. Since this problem report was originally made, a number of releases have occurred. Unfortunately you are not using the latest version and the problem might already be fixed. Please download the latest release from http://www.mantisbt.org/download.php If you are able to reproduce this bug in the current release, or have some more information on how this feature could be improved in the current release. Please either change the mantis version on this bug report Again, thank you for your continued support and report. |
|
related to
child of
duplicate of