Bug 250886 - Search via IMDB not working correctly, if you miss "the" in the name of movie.
Status: RESOLVED FIXED
Alias: None
Product: tellico
Classification: Applications
Component: general
Version: 2.3
Platform: Gentoo Packages Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Robby Stephenson
Reported: 2010-09-11 15:54 UTC by Fest
Modified: 2010-09-13 19:00 UTC
Description Fest 2010-09-11 15:54:32 UTC
Version:           2.3 (using KDE 4.5.1) 
OS:                Linux

I don't like to include articles (the/a/etc.) in entry names. The problem is that updating such an entry does not work correctly.

Why is it a bug? Because searching directly on IMDb (via a browser) works even without articles.

Reproducible: Always

Steps to Reproduce:
1) Add the entry "Dark Knight" (or some other movie whose name starts with "the", for example "The Fifth Element").
2) Update the entry via IMDb.


Actual Results:  
The search returns nothing, or some other movies.

Expected Results:  
The search returns "The Dark Knight".
Comment 1 Fest 2010-09-12 01:52:55 UTC
Searching via "Internet Search" works, so the problem is definitely in the "Update entry" search.
Comment 2 Robby Stephenson 2010-09-12 05:11:57 UTC
The subtlety for "Dark Knight" is that there is a result at IMDB that matches the title exactly, without "the". http://akas.imdb.com/title/tt0251504/

Tellico looks at that result and sees that it matches "better" than the result for "The Dark Knight". That's why it doesn't match without the article.

If that movie title without the article did not exist, then your expected results would happen.

So to get what you describe, I'd have to loosen the matching strictness. I'm not sure I want to do that, especially since it sounds like you're describing a case where the match is against nothing but the title. If you add "Christopher Nolan" as the director, Tellico's matching algorithm matches "The Dark Knight" instead of "Dark Knight" because it has more field values with closer matches.
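
To make the effect of extra field values concrete, here is a minimal sketch of a field-based scorer. It is not Tellico's actual code: the weights (10, 8, 5), the field names, and the blank director on the "Dark Knight" result are all assumptions made for illustration.

```python
# Hypothetical scoring sketch; weights and field names are assumptions,
# not values taken from Tellico's source.
ARTICLES = ("the ", "a ", "an ")

def strip_article(title):
    """Lower-case a title and drop one leading English article."""
    t = title.strip().lower()
    for article in ARTICLES:
        if t.startswith(article):
            return t[len(article):]
    return t

def title_score(entry_title, result_title):
    """An exact match beats a match that only works after dropping the article."""
    if entry_title.strip().lower() == result_title.strip().lower():
        return 10
    if strip_article(entry_title) == strip_article(result_title):
        return 8
    return 0

def score(entry, result):
    """Sum per-field scores; only fields filled in on the entry contribute."""
    total = title_score(entry.get("title", ""), result.get("title", ""))
    director = entry.get("director")
    if director and director.lower() == result.get("director", "").lower():
        total += 5
    return total

def best_match(entry, results):
    """Return the search result with the highest score."""
    return max(results, key=lambda r: score(entry, r))

results = [
    {"title": "Dark Knight", "director": ""},  # director left blank: placeholder
    {"title": "The Dark Knight", "director": "Christopher Nolan"},
]

# Title only: the exact title wins (10 vs 8).
print(best_match({"title": "Dark Knight"}, results)["title"])
# Title plus director: the director match tips the balance (10 vs 13).
print(best_match({"title": "Dark Knight", "director": "Christopher Nolan"}, results)["title"])
```

With only a title, the exact-title result scores highest; once a director is filled in, the result with the article overtakes it.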

Maybe there's a way to improve it without degrading the existing matching. I'll have to think about it.
Comment 3 Robby Stephenson 2010-09-12 05:33:00 UTC
SVN commit 1174352 by rstephenson:

Change match algorithm when updating an entry to return more good matches when multiple results exist.
BUG: 250886


 M  +4 -0      ChangeLog  
 M  +12 -1     src/entryupdater.cpp  


WebSVN link: http://websvn.kde.org/?view=rev&revision=1174352
Comment 4 Fest 2010-09-12 05:36:50 UTC
Sorry, but the problem is in the "Update entry" search, and it's not even about the article "the". "Update entry" searches only for the exact name, while "Internet Search" searches a lot deeper.

Examples:
Dark Knight (without the):
1)"Update entry" - not found (something else instead)
2)"Internet Search" - found.

Fifth Element (without the):
1)"Update entry" - not found at all.
2)"Internet Search" - found.

Die Hard 4 (instead of "Live Free or Die Hard"):
1)"Update entry" - not found at all.
2)"Internet Search" - found.

Terminator 2 (without ": Judgment Day"):
1)"Update entry" - not found at all.
2)"Internet Search" - found.

As you can see, the "Update entry" search doesn't work if the search string is not 100% correct.
So it's definitely not the same search (I mean the way/command used to get info from IMDb). And since "Internet Search" works perfectly (and much, much faster), maybe use it in the "Update entry" menu too?

Filling in all of an entry's fields just to find its info is a really bad idea. The search is there to fill in the info, not the other way around.
Comment 5 Robby Stephenson 2010-09-12 18:06:49 UTC
(In reply to comment #4)
> Sorry, but problem in "Update entry" search. And it's not even in "the"
> article. "Update entry" searching only for exact name, but "Internet Search"
> searching a lot deeper.
>
> So it's definitely not the same search (i mean way/command to get info from
> imdb). And since "Internet Search" works perfect (and much much faster), maybe
> use it in "Update entry" menu too ? 

Trust me, I wrote the code, and both the search and the update definitely _do_ use the same search. It's not the searching, it's the algorithm that attempts to match the search results with the entry that you're updating in order to determine whether a search result matches the entry being updated.

Ideally, the user shouldn't have to pick from _all_ the search results when updating values. That's not a good workflow and it's what you're asking me to do. Ideally, the update takes place without additional user input.

Tellico attempts to figure out which search result is best and then updates from that. When it has nothing but the title to use for determining the best match, you get your initial result with "Dark Knight", as I explained in comment #2.

The reason why the search seems faster is that it doesn't grab all the details from every search result. The updating does grab all the details (which requires downloading more pages) since it has to compare all the field values to determine the matching.
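
The difference between the two commands can be sketched roughly as follows. This is an illustration only: `search`, `fetch_details`, `CATALOGUE`, and the `id-1`/`id-2` keys are hypothetical stand-ins for the real data source, and the exact-title comparison is a placeholder for the real matching.

```python
# Illustrative only: all names here are stand-ins, not Tellico's API.
CATALOGUE = {
    "id-1": {"title": "Dark Knight"},
    "id-2": {"title": "The Dark Knight"},
}

fetched = []  # records which detail pages were downloaded

def search(query):
    """Shared by both commands: returns lightweight (key, title) hits."""
    q = query.lower()
    return [(key, rec["title"]) for key, rec in CATALOGUE.items()
            if q in rec["title"].lower()]

def fetch_details(key):
    """Simulates the extra download needed to get all field values."""
    fetched.append(key)
    return CATALOGUE[key]

def internet_search(query):
    """Shows the hit list; no details are fetched until the user picks one."""
    return search(query)

def update_entry(entry):
    """Runs the same search, then fetches every hit to compare it with the entry."""
    hits = search(entry["title"])
    details = [fetch_details(key) for key, _title in hits]
    wanted = entry["title"].lower()
    for record in details:
        if record["title"].lower() == wanted:
            return record
    return None

print(internet_search("dark knight"))          # two hits, nothing fetched yet
print(len(fetched))                            # 0
print(update_entry({"title": "Dark Knight"}))  # same two hits, both fetched
print(len(fetched))                            # 2
```

Both paths call the same `search`; the update path is slower only because it downloads details for every hit before comparing.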

If you're able to try the current SVN, I suggest you see if my last commit might help you in what you're doing. Otherwise, check the next release and let me know. It's an attempt to compromise between your request and what I would prefer the user interaction be.
Comment 6 Fest 2010-09-12 23:27:57 UTC
I have no idea how to make SVN ebuilds, so I applied your patch locally.

Result:
Dark Knight - found !
Terminator 2 - not found.
Die Hard 4 -  not found.
Pirates of Caribbean - not found.
So it doesn't really cover most situations.

Maybe I'm wrong, but I see the use case of "update entry" like this:
New entry - write title - update entry to fill in the info about the entry.
And since update entry only understands an almost exact title, it's almost useless. Who has the patience to type titles like "Pirates of the Caribbean: The Curse of the Black Pearl"? More than 50 characters!

I agree with you about the ideal search, without the user picking a result. But it's nearly impossible in a lot of use cases. And since "internet search" works perfectly, maybe you can do something like this:
if "update entry" finds no result -> switch to the "internet search" algorithm instead?

IMHO it's a good idea: first the search tries to get a result without user interaction, and if that fails, the search expands.
Comment 7 Robby Stephenson 2010-09-13 07:30:50 UTC
(In reply to comment #6)
> Maybe I'm wrong, but i see use case of "update entry" like this:
> New entry - write title - update entry to fill info about entry.

You're right, that's probably at the root of our disagreement. I see the use case of adding an entry by using the internet search dialog. Why wouldn't you?

1. search "dark knight"
2. Click add
3. search "pirates of the caribbean"
4. click add
5. search "die hard"
6. click add, click add, click add.

That seems much easier to me. Why would you use the update command for adding new entries? Your use case is completely opposite of what the command is intended to do.

> if "update entry" not found result -> switch to "internet search" algorithm
> instead?

You're not reading what I wrote. There is no other algorithm, the search results are the same for both cases. And for the dark knight case, it's not that a result is not found.
 
> IMHO it's good idea, first of all, search trying to get result without user
> interaction, if failed - search expands.

That's just it, the update command is not intended to be a search. It's intended to update the existing entry by finding the search result with the best match.

If you want to add new entries, use the search dialog directly. That's what it's for. I think we'll just have to agree to disagree here...
Comment 8 Fest 2010-09-13 17:56:27 UTC
Now I understand. I have used it that way since I began using Tellico. I never even thought that it was meant for a different use ("internet search" is hidden in the menu and "update" is in the context menu, so it's much faster to click).

I added an "internet search" button to the toolbar, so it's not critical for me.

Anyway, if I ever make (or ask someone to make) a patch for "update entry" with this behavior:
if "update entry" finds no result -> switch to the "internet search" results instead.
Would that be OK with you, or is it a wontfix?

P.S.: the patch for the "the" article works fine, so I'm closing the bug.
Comment 9 Robby Stephenson 2010-09-13 18:28:29 UTC
(In reply to comment #8)
> I added "internet search" button to toolbar, so it's not critical for me.

That might be a bug. It's supposed to be in the toolbar by default.

> Anyway, if i ever make/ask-someone-to patch "update entry" for such behavior:
> if "update entry" not found result -> switch to "internet search" results
> instead.
> It's ok for you, or wontfix ?

Heh, I don't think you're quite getting it. There's no "switch to internet search". The Update Entry command _uses_ the internet search and gets the _same results_. Then the updater looks at all the results and finds the one with the closest match and uses that to update the existing data. I don't know how to explain this any other way. There's no difference in the results!

When you see nothing happen, it means that none of the search results match well enough, not that there were no results to begin with. So it sounds like the patch you would want would be to remove the matching comparison altogether and I won't do that.

For example, searching for "Die Hard 4" probably gets the "Live Free or Die Hard" result. But it compares those two titles and decides that they don't match at all (which they don't). And I can't think of a way for Tellico to know that those two results, with completely different titles, are actually the same movie.
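
A rough sketch of "results exist, but none match well enough" is shown below. The word-overlap measure and the 0.5 threshold are assumptions chosen for illustration, not the comparison Tellico actually performs.

```python
# Hypothetical matcher: word overlap (Jaccard) with an assumed cut-off.
def similarity(a, b):
    """Share of words the two titles have in common, from 0.0 to 1.0."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)

def pick(entry_title, result_titles, threshold=0.5):
    """Return the closest title, or None when even the best one is too far off."""
    best = max(result_titles,
               key=lambda t: similarity(entry_title, t),
               default=None)
    if best is None or similarity(entry_title, best) < threshold:
        return None
    return best

# The search did return something, but the titles share too few words.
print(pick("Die Hard 4", ["Live Free or Die Hard"]))  # None
# Here two of three words agree, so the result is accepted.
print(pick("Dark Knight", ["The Dark Knight"]))       # The Dark Knight
```

In the first call the updater sees a result and still does nothing, which is the behavior described above.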
Comment 10 Fest 2010-09-13 19:00:25 UTC
OK, I think I get it. It's the same search, but it gives a different result because "update" tries to pick the result that matches the title?

If so, how about a patch with this behavior:
if "Update" finds no exact match, expand the match criteria to similar titles (or let IMDb decide what matches)?

Because right now the "update" comparison is too tight and returns no match for:
Pirates of the Caribbean - only for "Pirates of the Caribbean: The Curse of the Black Pearl".
Die Hard 4 - only for Live Free or Die Hard.
and so on.

I agree with you that the titles are not really the same. But the IMDb search compares these titles and puts the right result at the top of its suggestions.

So I propose not to remove/break/change the default matching comparison in update, but to switch to other options only if no result is returned.
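
The proposed fallback could look something like the sketch below. It is only an illustration of the idea in this comment, not existing Tellico behavior; the function name, the three passes, and the sample result lists are assumptions.

```python
# Sketch of the proposal: keep the strict pass, relax only when it finds nothing.
def update_with_fallback(entry_title, ranked_titles):
    """ranked_titles is assumed to be in the order the site returned them."""
    wanted = entry_title.strip().lower()

    # Pass 1: the existing strict behavior (exact title).
    for title in ranked_titles:
        if title.lower() == wanted:
            return title, "exact"

    # Pass 2: accept a longer title that starts with what the user typed.
    for title in ranked_titles:
        if title.lower().startswith(wanted):
            return title, "prefix"

    # Pass 3: let the site decide and take its top-ranked result.
    if ranked_titles:
        return ranked_titles[0], "top-ranked"
    return None, "none"

print(update_with_fallback(
    "Pirates of the Caribbean",
    ["Pirates of the Caribbean: The Curse of the Black Pearl"]))
print(update_with_fallback("Die Hard 4", ["Live Free or Die Hard"]))
```

The last pass is the risky one: it updates the entry from whatever the site ranks first, so a wrong top result would silently overwrite the entry's data.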