Welcome to the Cos4Cloud iSpot User Group 2nd iForum LIVE! The focus of this session is: AI and iSpot: a spotlight on the Pl@ntNet API.
Please join us for a LIVE scheduled chat discussion with the Pl@ntNet API development team and iSpot Admin on Wednesday, October 12th, 5:30 p.m. BST / 6:30 p.m. CEST.
This is part of the Cos4Cloud iSpot User Group, set up for testing and gathering feedback on the integration of Cos4Cloud AI services as the iSpot community uses them.
Cos4Cloud is a European Horizon 2020 project supporting citizen science by developing technological services that address challenges shared by citizen observatories, helping them boost both the quantity and the quality of observations. The Open University is a project partner, and iSpotnature.org is a Cos4Cloud citizen observatory; this role includes integrating and testing relevant services. iSpot is currently trialling two Cos4Cloud image recognition technologies: Pl@ntNet and FASTCAT-Cloud.
iForum LIVE! sessions are scheduled group discussions which members of the group can join LIVE on the date and time publicised. Users are also invited to add comments and contribute anytime.
I do use Pl@ntNet’s web interface on occasion - if I need my memory jogged about a name, if I’m stumped, or if I want a second opinion because it’s a plant outside my usual experience. (From reviews I have the impression that Seek is the best identification app, but I don’t have any device that will run it, and most other apps require live Internet access - Google Lens sort of works on the web.)
I understand that this conversation is with INRIA (Pl@ntNet) rather than OU (iSpot) staff, so issues with how iSpot is interacting with Pl@ntNet may be outside the intended scope of this conversation. However, I’ve copied in below my post from the end of the last thread, should anyone want to discuss the points contained therein.
It’s not clear to me what the objective of this integration of Pl@ntNet Identify with iSpot is - it isn’t necessary to incorporate identifications from the AI into the iSpot user experience in order to trial the AI.
Regardless, the way it is being applied makes that AI look worse than it is.
The Pl@ntNet Identify global database is being used, which results in similar plants not found in Britain and Ireland being picked up. (If Lotus ucrainicus were present in Britain, it and Lotus corniculatus may well be considered a difficult pair - as it is, distinguishing British Lotus isn’t trivial, and these two are closer than any pair of British species.) Pl@ntNet Identify has a West European data set which would work better.
The top 3 suggestions from Pl@ntNet Identify are being given regardless of whether the percentage probabilities are 90/5/5, 32/31/30, or even 10/7/4, which are all very different situations, and allowing low-probability IDs through exacerbates the problems of using the global database. Setting a 30% threshold would give 0-3 alternatives per observation and eliminate the worst solecisms. (Or the threshold could be set even higher.)
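To illustrate the thresholding idea: the sketch below filters suggestions by score before showing them. It assumes the Pl@ntNet identify API's response shape (a list of results, each with a `score` in [0, 1] and a nested species name); the function and sample names are my own.

```python
def filter_suggestions(results, threshold=0.30, max_n=3):
    """Keep at most max_n suggestions whose score meets the threshold.

    `results` is assumed to follow the Pl@ntNet identify response shape:
    a list of dicts, each with a 'score' in [0, 1] and a nested species name.
    """
    kept = [r for r in results if r["score"] >= threshold]
    return sorted(kept, key=lambda r: r["score"], reverse=True)[:max_n]

# Helper to build mock results from a list of scores (illustrative only).
def sample(scores):
    return [{"score": s, "species": {"scientificNameWithoutAuthor": f"sp{i}"}}
            for i, s in enumerate(scores)]

# The three probability splits mentioned above behave very differently:
print(len(filter_suggestions(sample([0.90, 0.05, 0.05]))))  # 1 alternative
print(len(filter_suggestions(sample([0.32, 0.31, 0.30]))))  # 3 alternatives
print(len(filter_suggestions(sample([0.10, 0.07, 0.04]))))  # 0 alternatives
```

Under a 30% cut-off the 90/5/5 case shows one name, 32/31/30 shows all three, and 10/7/4 shows none - exactly the 0-3 range described.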
I have a strong suspicion that taxonomic differences between iSpot’s dictionary and Pl@ntNet Identify’s are being ignored. Rather than just stating that Pl@ntNet Identify’s proposed identification is not in the iSpot dictionary, it could be checked whether iSpot has it as a synonym, and replaced by the name iSpot accepts if it does. (This wouldn’t have caught pannonicum, as iSpot only has pannonicum subsp. maritimum .)
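The synonym check could look something like this. The data structures are hypothetical (I don't know how iSpot's dictionary is actually stored); the point is only that a synonym lookup is a small step beyond a plain membership test.

```python
def map_to_ispot_name(proposed, ispot_accepted, ispot_synonyms):
    """Map an AI-proposed name onto the iSpot dictionary.

    ispot_accepted: set of names iSpot accepts (hypothetical structure).
    ispot_synonyms: dict mapping synonym -> accepted name (hypothetical).
    Returns the accepted name, or None if the name is absent entirely.
    """
    if proposed in ispot_accepted:
        return proposed
    # Fall back to the synonym table; None signals a genuinely missing name.
    return ispot_synonyms.get(proposed)

accepted = {"Lotus corniculatus"}
synonyms = {"Lotus corniculatus var. corniculatus": "Lotus corniculatus"}
print(map_to_ispot_name("Lotus corniculatus var. corniculatus", accepted, synonyms))
```

Only when the lookup returns None would iSpot need to report "not in the dictionary".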
Pl@ntNet uses flower/leaf/fruit/bark/habit hints, which are usually not available with iSpot observations.
Pl@ntNet Identify is being fed red algae, brown algae, green algae, and bryophytes, but it can only cope with vascular plants.
That Pl@ntNet Identify is set up to identify species, rather than larger groups, is a failing there, rather than with the integration.
An obvious way to “trial” the AI is to run it over every vascular plant observation in iSpot, and see in what proportion of cases the AI’s preferred identification matches the likely ID. Interpreting this number has several pitfalls, but you could get a better handle on it by calculating a quality grade for iSpot likely IDs (Pl@ntNet Identify has one built in - its percentage figure) and seeing how the rate of agreement varies with grade.
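A sketch of that evaluation, using the AI's own score as the quality grade. The record format and function are my invention; the idea is just to bucket observations by confidence and compute the agreement rate per bucket.

```python
from collections import defaultdict

def agreement_by_grade(records, n_buckets=10):
    """Bucket (ai_score, ai_name, likely_id) records by AI confidence and
    compute the agreement rate per bucket. Scores are assumed in [0, 1]."""
    hits, totals = defaultdict(int), defaultdict(int)
    for score, ai_name, likely in records:
        b = min(int(score * n_buckets), n_buckets - 1)  # bucket index 0..9
        totals[b] += 1
        hits[b] += (ai_name == likely)
    return {b: hits[b] / totals[b] for b in sorted(totals)}

# Illustrative data: high-confidence calls agree, mid-confidence ones are mixed.
data = [(0.95, "A", "A"), (0.92, "B", "B"), (0.45, "A", "B"), (0.41, "C", "C")]
print(agreement_by_grade(data))  # {4: 0.5, 9: 1.0}
```

If agreement climbs steeply with grade, that supports thresholding; if it doesn't, the score is a poor quality proxy.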
A variation would be to first set up a Britain and Ireland data set at Pl@ntNet Identify.
Another line of attack would be to extract a set of “research grade” observations from iSpot, use that to train the Pl@ntNet Identify engine, and then repeat the above. However, Pl@ntNet Identify’s Western Europe dataset has an average of 900 photographs per taxon; I don’t have access to the comparative number for iSpot, but a back-of-the-envelope calculation suggests a data set several times smaller. So it can’t be guaranteed that there’s sufficient data to train the engine. But it does offer an opportunity with organisms other than vascular plants.
It seems to me that you could train the engine twice - once with species, and once with genera. Then if it doesn’t give a high probability identification to species, you ask it again for genus. This would help with taxa like Taraxacum, Hieracium, Rubus, Salix, Limonium, Oenothera, Epilobium, …
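The two-pass idea could be wired up as below. Both engines are hypothetical callables (nothing here reflects Pl@ntNet's actual training or API); the sketch only shows the fallback logic.

```python
def identify_with_fallback(image, species_engine, genus_engine, threshold=0.30):
    """Query a species-trained engine first; if no suggestion clears the
    threshold, fall back to a genus-trained engine. Each engine is a
    hypothetical callable returning (name, score) pairs sorted by score."""
    species = species_engine(image)
    if species and species[0][1] >= threshold:
        return ("species", species[0])
    genus = genus_engine(image)
    if genus and genus[0][1] >= threshold:
        return ("genus", genus[0])
    return ("unknown", None)

# A Taraxacum-like case: no species is confident, but the genus is.
sp = lambda img: [("Taraxacum officinale agg.", 0.12),
                  ("Taraxacum erythrospermum", 0.10)]
gn = lambda img: [("Taraxacum", 0.84)]
print(identify_with_fallback(None, sp, gn))  # ('genus', ('Taraxacum', 0.84))
```

For critical genera the second pass would at least return a usable rank instead of a string of low-confidence species guesses.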
We have developed a service that predicts the species list depending on the location/habitat: Pl@ntNet identify. It is also accessible via the API and can be used to filter species. It will be integrated into the mobile application soon (with an option to “filter the identification results” by the species around me).
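On the client side, such a filter could be as simple as intersecting the suggestions with a location-specific checklist. The response shape is assumed (the Pl@ntNet identify format with a nested species name); the function and data are illustrative only.

```python
def filter_by_local_species(results, local_species):
    """Drop suggestions whose species are not on a local checklist.

    `local_species`: set of scientific names (e.g. a Britain & Ireland list).
    `results`: assumed Pl@ntNet identify response shape, one nested
    species name per suggestion.
    """
    return [r for r in results
            if r["species"]["scientificNameWithoutAuthor"] in local_species]

results = [
    {"score": 0.55, "species": {"scientificNameWithoutAuthor": "Lotus ucrainicus"}},
    {"score": 0.40, "species": {"scientificNameWithoutAuthor": "Lotus corniculatus"}},
]
local = {"Lotus corniculatus"}
print(filter_by_local_species(results, local))  # keeps only L. corniculatus
```

This would address the Lotus ucrainicus problem above without waiting for a regional training set.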
Concerning the choice of the flora, it is up to iSpot to decide. But an important piece of information is that we will soon use the Kew checklist, which will allow us to have a flora by country for all the countries of the world.
The Pl@ntNet web interface is set up to use user hints (flowers, fruits, leaves, bark, habit, other) with images, but this information is not reliably available with images loaded onto iSpot. How would this affect Pl@ntNet’s accuracy? (I would have thought that Pl@ntNet’s AI would be rather good at categorising images into these categories - perhaps better than users; for example, the recent observation on iSpot where what I think was an Oxalis leaf was described as a flower.)
I note that the image loading page on iSpot could be amended to require such categorisation, though the fact that iSpot deals with organisms other than vascular plants might mean that this would be trickier than it first appears.
[quote=“lavateraguy, post:14, topic:1713”] Pl@ntNet uses flower/leaf/fruit/bark/habit hints, which are usually not available with iSpot observations.[/quote]
There is a hidden way to make use of this feature, which is to put those terms into the names of the image files you upload. It sounds a bit of a kludge, but this was the easiest and quickest way to incorporate that feature without masses of coding.
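For users wondering how such a filename convention might be read, here is a minimal sketch. The exact matching rule iSpot uses isn't documented in this thread; this version simply looks for any of the known organ words anywhere in the file name.

```python
# Recognised organ hints, matching the Pl@ntNet web interface categories.
ORGANS = ("flower", "leaf", "fruit", "bark", "habit", "other")

def organ_from_filename(filename):
    """Extract an organ hint embedded in an uploaded image's file name.

    This is a guess at the mechanism: scan the lowercased name for the
    first known organ word and return it, or None if nothing matches.
    """
    name = filename.lower()
    for organ in ORGANS:
        if organ in name:
            return organ
    return None

print(organ_from_filename("hawthorn_flower_closeup.jpg"))  # flower
print(organ_from_filename("IMG_2041.jpg"))                 # None
```

So a file named e.g. `hawthorn_flower_closeup.jpg` would carry a "flower" hint, while an unrenamed camera file carries none.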