iForum LIVE!: AI and iSpot: a spotlight on the PlantNet API

Here is another link that is not the LIVE discussion. Where are you?
The header says LIVE.

Their training would have used a lot more than 6 (they mentioned hundreds or thousands per taxon). But I too have found that where they have got things wrong some of the exemplars are also wrong, i.e. the exemplars were not the species they were supposed to be.

I've inferred that the web interface selects the exemplars from the training set based on similarity to the querent - I would be very surprised if the training set for Lamium album lacked any images of flowers.

Some taxa have fewer than 6 exemplars provided by the web interface. The mean number of elements in the training set for a taxon is near a thousand, but the taxa with a low number of exemplars make me wonder whether any lower cut-off was implemented. Or these might be user-contributed images for a taxon not in the original training set.

reading back
this is VERY short of Regular, even Irregular, Users. Have you yet worked out why?
Gulvain above makes useful points - why has no-one responded?
Louisa's really relevant RECENT (above) question remains unanswered

Janice wrote elsewhere: "We are gathering all the feedback, hence why these types of discussions are such an important part of the process."
There is no evidence of the Development Team's presence in my Project (two have Favourited it).
There is little evidence that feedback is being used to inform the next stage (Phase) of this trial.
There is clear feedback in each of the Live Forums and via my project but still no statement as to what next.
What next please?

The lack of thread coherence regarding the various AI applications is not helping if garnering responses is an objective. The plant and mammal applications should have been treated as distinct projects.

From my standpoint the introduction of PlantNet does not constitute a trial, unless it was the technical exercise of getting PlantNet to automatically detect plant observations and add comments. However, demonstrating that third-party software can be added to iSpot seems a relatively narrow objective.

I therefore assume that the users and their observations are really what it is about. I do not recall seeing a trial plan: something stating the aims and objectives and, of course, the duration. Were the point to check PlantNet results against iSpot identifications, then it could have been done entirely invisibly. Given that the oversized comments have been added, one can only assume that iSpot user response is required. Therefore there must be a mechanism for responding directly from the comment; otherwise what sort of user feedback data are you going to be analysing? At the moment there is feedback lurking all over the place and there is no obvious route to analysing it. The only way of getting structured, usable data is to control how it is obtained.

The PlantNet trial should also have useful technical documentation on the principles of the AI recognition used. At the end of the trial there should be a published analysis including, among other things, how the trial performed against its objectives. It would be interesting to know the statistics on PlantNet identifications versus human identifications, at both species and genus level.

The trial may well be highly organised from the perspective of those running it, but that is not apparent from the user's end. I'm sorry I also missed the second forum; I was mostly otherwise engaged.


Hi Gulvain, all,

thanks a lot for all your feedback, which I think is very useful. Here are a few answers and comments related to the discussion:
1- the identification performance of Pl@ntNet is dependent on the quality and type of pictures sent. Pl@ntNet prefers close-up pictures of the organs of a single specimen rather than general views of the whole plant or cover pictures of several specimens. There is no magic. If the discriminating attributes are not visible, the species cannot be determined precisely.
2- the identification performance of Pl@ntNet is also closely related to the training data available for each species. The training images used for each species can be consulted and revised in the Pl@ntNet applications, e.g. here for Urtica dioica L. (galleries at the bottom of the page). For each observation, Pl@ntNet users can suggest alternative species names, with a principle of weighted majority voting (expert users have more weight than novice users).
3- Pl@ntNet should not be considered a competitor to iSpot experts. It is the result of a huge collaborative effort by amateur and expert botanists who contributed to its learning with their photos, but of course, like any AI, it is still imperfect and uncertain in some cases. AI algorithms have the advantage of being able to identify a very large number of species, but they cannot invent information that is not present in the submitted image or in the images of that species in the database.
4- We already have many quantitative evaluations of the identification performance of Pl@ntNet, so this is not what is targeted in this trial. The objective is rather to see how iSpot members perceive this kind of technology, how they manage to appropriate it or not, and whether they find it useful or not. The misidentified photos mentioned in this discussion show us, for example, the importance of informing users about the type of images that should be submitted for precise identification and the type of results they can expect. Ideally, we would love Pl@ntNet to be able to suggest other pictures to take, as expert botanists do. There is research on this subject but we are still far from being able to achieve it.

Kind regards,
Alexis Joly

Thanks Alexis
there are over 1100 PlantNet responses. When is this phase to be completed?
It is making a bit of a mess of Seaweed Observation - is it possible to 'write out' that small area of plants?
You write "The objective is rather to see how iSpot members perceive this kind of technology, how they manage to appropriate it or not". We have the facility to INappropriate; do you want us to use it?
It removes the PlantNet contribution.

That answers a lot of the questions we have been asking.

§ Pl@ntNet Identify (and other apps) are a useful tool for people who are aware of their limitations. But I think you'll find that the people here who use it go to the web interface. As has been pointed out before, the way it is being used by iSpot does not show it to its advantage.

Pl@ntNet Identify is keen on Lotus ucrainica (which really is similar to the British Lotus corniculatus) and Urtica morifolia as identifications. Using location information to discriminate between congeners (well, consubgers and consectioners is probably more the issue) is more easily achieved than assembling a training set sufficiently large to do the job, especially given the expectable limitations of querent images.

I think that the concerns about garden plants and escapes are a distraction. They're a small proportion of observations, and for a variety of reasons I expect that Pl@ntNet will perform less well on them - garden cultivars are often distinct in appearance from their wild conspecifics, and those species are also less likely to be within Pl@ntNet's coverage.

Just offering the top 3 taxa from Pl@ntNet throws away most of the useful information - we can't tell the difference between "Pl@ntNet is as sure as it can be that it's the first taxon" (95/2/2) and "Pl@ntNet has no idea" (3/2/1). This has had an adverse effect on the perception of Pl@ntNet Identify.

Regarding the question of image quality, you have to expect that people will provide images of lower quality, not focussing on the desired organs. There are newbies who don't know what is needed to identify the plants (on the one hand we have to explain that we often can't identify a plant, particularly a tree, from a shot of the whole plant, and on the other hand that a picture of a whole plant is often sufficient to identify it, and even when it isn't it provides the necessary context to interpret close-ups of parts of the plant); there are people who use the cameras on old phones; there are photographs of plants which can't be approached closely enough to get a good photograph (I've just had that problem with a yellow-green flowered Nicotiana); there are plants which are in a clutter of other plants; there are plants which are not at an ideal stage of growth (we've just had a wetland umbellifer with inflorescences, but only the uppermost leaves remaining); and so on.

§ Pl@ntNet suggestions are being added a fair amount of time after observations are uploaded. I think this was informed by a desire not to discourage participation by the more expert among the users by giving them nothing to do. But this does mean that Pl@ntNet's contributions generally arrive after things have been settled (which means that they're of less interest), and if they haven't then the observation is likely to be beyond Pl@ntNet's competence. However, with the way iSpot is currently using Pl@ntNet, having it add identifications or agreements would result in us spending time correcting Pl@ntNet's mistakes, which would be discouraging in a different way.

There are two things that I think should be done - change how iSpot presents Pl@ntNet's output, and think about alternative models for how Pl@ntNet interacts with users. Unlike some people, I'm not particularly bothered about the "obtrusive" comments from Pl@ntNet, but I'm using a large monitor, not a phone or a tablet (though I still have to scroll past it to find any comments below it).

One alternative is an "Ask Pl@ntNet" button on both Add and View Observation, so anyone who wants to see its output can do so, without cluttering up iSpot's rendition of observations. This might increase (or might decrease) the load on your servers, but would make it less work for anyone wanting a first or second opinion.

Concern has been expressed about the interaction of automated identification with iSpot's educational mission, i.e. that it inhibits the development of identification skills. That would be a topic for you and the OU to discuss between yourselves.

§ I wonder whether I should offer my services as a combination of a talented and experienced (but rusty) software engineer (full life cycle) and domain near-expert, but I assume that you have people with that combination of skills already involved.


I've just had a glance at a couple of species in their data.

There are about 1,000 images of Abutilon megapotamicum flowers. On a cursory scroll through I spotted about half a dozen errors (Abutilon x milleri, Abutilon x hybridum, and one unclear between Abutilon striatum and Abutilon x hybridum). A closer look might find a few more - there were some where I wasn't sure whether they were megapotamicum or milleri.

There are about 5,000 images of Lamium album, and no errors jumping out from the thumbnails, so my inference that the errors I mention above were inadvertently filtered in by the exemplarisation process would seem to be confirmed.

And there's at least one Malope trifida among the 16,000 Malva sylvestris flowers.

But the rate of errors does seem to be quite low.

It's up to iSpot developers to decide which observations they want to send to Pl@ntNet or not.

Thanks ajoly. Can I ask what you mean by "It's up to iSpot developers to decide which observations they want to send to Pl@ntNet or not"?
.
Do you mean that a person at iSpot is manually determining which observations go to Pl@ntNet? (I doubt that is the meaning)
.
OR do you mean that iSpot developers have set up an algorithm to look for observations that have tags which are 'plant'?
.

OR do you mean something else?

Pl@ntNet provides the API. iSpot selects which observations are fed into the API. iSpot has been sending in, with manual intervention but perhaps no selection, observations in the Plant group. In principle the script that they are running could be modified to check existing identifications, and filter out those which are not tracheophytes (i.e. rhodophytes, phaeophytes, chlorophytes, glaucophytes, bryophytes, …), or perhaps filter even more (the training set for lycophytes is rather thin). Perhaps a restriction to spermatophytes is reasonable.

I've been browsing Pl@ntNet's data. The number of species with thousands of images is actually quite small - of the 45 species of Lotus in their data, Lotus corniculatus has 16,000+ images, Lotus glaber and Lotus pedunculatus between 1,000 and 2,000, and the remaining species fewer, some down in single figures.

Hi JoC,

I mean that the choice of images sent or not to Pl@ntNet is made on the iSpot side, not on the Pl@ntNet side. I imagine that iSpot's metadata could be used by iSpot developers to better target the content sent to Pl@ntNet.

Kind regards,
Alexis

Thank you, Alexis. …

As it stands, all observations with images that are labelled with the group tag "plants" are submitted to the Pl@ntNet API.

Thank you Chris. All is becoming clear(er).

I was interested to see that on ID Please | Observation | UK and Ireland | iSpot Nature
the AI seems to have used lopper's photo as the basis for identifying lopper's photo… or is it just me not understanding what is going on?


I think that the AI bases its identification on a portfolio of photos, but then offers the most visually similar as its exemplars, which would of course include the querent itself if it had previously been contributed to PlantNet.

I thought it was the same picture. In which case we might all see our own photos being used to "ID" our own observations. Would that count as Auto-id cloning?

.
However, now I don't think the photo is the same picture; very similar, but not the same.
.
But interesting all the same.

There is no automatic process that integrates iSpot observations into Pl@ntNet's database. If some iSpot observations are duplicated in the Pl@ntNet database, it is probably because their author shared them on both platforms. And indeed, this creates a kind of loop. In general, the AI will just return the same species as the one originally associated with the observation.