Summary: | USDA database incomplete & has wrong ID numbers | ||
---|---|---|---|
Product: | [Applications] krecipes | Reporter: | Anthony DeRobertis <anthony> |
Component: | general | Assignee: | Unassigned bugs mailing-list <unassigned-bugs> |
Status: | CONFIRMED --- | ||
Severity: | normal | CC: | aacid, sgmoore, unassigned-bugs |
Priority: | NOR | ||
Version: | 2.0-beta2 | ||
Target Milestone: | --- | ||
Platform: | Debian testing | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: | |||
Attachments: |
Patch to update to SR24
Program to aid in verifying ingredient-data-en_US.txt
Patch to update ingredient data file (instead of generate from USDA data as in other patch) |
Description
Anthony DeRobertis
2011-12-18 14:13:02 UTC
Actually, looking at the old abbrev.txt file, I have no idea where that ID (for the Canadian bacon) came from; it doesn't appear in the NDB file.
Updating abbrev.txt seems to have mostly worked, but it looks like there are several other data files I need to update as well.
Also, it appears that there is an 'ingredient-data-en-US' which I guess was done by hand? Unfortunately, it contains incorrect data, for example:
tomatoes, stewed:11693
but 11693 is crushed tomatoes. Stewed tomatoes are 11533 ("Tomatoes, red, ripe, canned, stewed").
I'm going to replace it with data from FOOD_DES.txt; will upload all the new files once I finish them...
Created attachment 66870 [details]
Patch to update to SR24
This updates the weights.txt file as well, even though krecipes is (unfortunately!) not using it.
The number of fields in abbrev.txt changed, updated a define.
(xz'd; Bugzilla refused the patch for being too large. USDA data files are large. Not much I can do about that. gzip and bzip both exceeded 1MB)
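Since the field count in abbrev.txt changed between releases, a quick check along these lines could flag the mismatch before the import runs. This is only a sketch: it assumes abbrev.txt uses the same `^`-delimited layout as the other USDA SR ASCII files, and the function name is made up for illustration (it is not part of krecipes or of the attached patch).

```python
from collections import Counter


def field_count_histogram(lines, delimiter="^"):
    """Count how many non-empty lines have each number of delimited fields.

    Assumption: fields are separated by '^', as in the USDA SR ASCII files.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\r\n")
        if line:
            counts[len(line.split(delimiter))] += 1
    return counts
```

Running it over the old and new abbrev.txt (for example `field_count_histogram(open(path, encoding="latin-1"))`) should give a single field count for each file, and comparing the two shows whether the define needs to change.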
It appears that the ingredient-data-en-US.txt file also controls which ingredients are loaded by default, and the common names also help in recipe matching. I'm going through it and fixing it, but it's going to take a bit.
I'm a third of the way through; there are a *lot* of mistakes in it.
Created attachment 67007 [details]
Program to aid in verifying ingredient-data-en_US.txt
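For readers who want to repeat this kind of check, here is a minimal sketch of the idea. It is not the attached program. It assumes the ingredient file holds one `name:ID` entry per line (inferred from the `tomatoes, stewed:11693` example above) and that FOOD_DES.txt follows the USDA SR ASCII layout: `^`-separated fields, text fields wrapped in `~`, with the NDB number first and the long description third. All function names are hypothetical.

```python
def parse_usda_line(line):
    """Split one USDA SR line into fields, dropping the '~' text quoting."""
    return [field.strip("~") for field in line.rstrip("\r\n").split("^")]


def load_food_descriptions(lines):
    """Map NDB number -> long description (assumed to be fields 0 and 2)."""
    foods = {}
    for line in lines:
        if not line.strip():
            continue
        fields = parse_usda_line(line)
        foods[fields[0]] = fields[2]
    return foods


def check_ingredients(ingredient_lines, foods):
    """Yield (common name, NDB number, USDA description or None).

    Assumption: each entry looks like 'name:ID'; other lines are skipped.
    """
    for line in ingredient_lines:
        line = line.strip()
        if not line or ":" not in line:
            continue
        name, ndb_no = line.rsplit(":", 1)
        ndb_no = ndb_no.strip()
        yield name.strip(), ndb_no, foods.get(ndb_no)


def report(food_des_path, ingredient_path):
    """Print each entry next to the USDA description its ID points to."""
    # latin-1 is an assumption; it decodes any byte sequence without error.
    with open(food_des_path, encoding="latin-1") as f:
        foods = load_food_descriptions(f)
    with open(ingredient_path, encoding="latin-1") as f:
        for name, ndb_no, desc in check_ingredients(f, foods):
            print(f"{name} -> {ndb_no}: {desc or 'NOT IN FOOD_DES'}")
```

The output still needs a human to read it: an ID that exists but names the wrong food, like the stewed versus crushed tomatoes case, only shows up when the common name and the USDA description are compared side by side.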
Created attachment 67008 [details]
Patch to update ingredient data file (instead of generate from USDA data as in other patch)
Some entries were plain wrong: they pointed to the wrong food. Many of them were close matches, so they may have come from before the correct food existed in the USDA database. But that strategy doesn't really work, as it leaves incorrect data imported. So where I couldn't find a match, I just deleted the entry.
I also fixed some names, to use what they're normally called in the US. Though where that leads to confusion, I've added more to the name to clarify.
I may have missed some...
At this point, I'm done with spamming this bug for a while. Hope to actually use krecipes now...