UK recording systems

This may be of interest: https://nbn.org.uk/news/introducing-inaturalist-for-the-uk/
It looks like NBN has now launched an iNaturalist node. It’s interesting, though, that they aren’t planning to incorporate the records into the NBN Atlas. Do iSpot records go into the NBN Atlas?

There are a few iSpot records on NBN, but they have gone via other schemes and societies, or via iRecord. We are currently updating our UK species dictionary, and once that is done it will make it much easier for iSpot data to go onto NBN.

I have been treating iNat entries in the Record Portals with GREAT suspicion for a long time. I often EXCLUDE iNat Records from my analyses.
In iNat, Research Grade (RG) records are automatically loaded into GBIF when they become RG, and RG is ‘awarded’ after a single agreement. I have challenged quite a few of those, and my opinion was strengthened by the replies. Admin did not like my investigation, reminding me that Citizen Science was the upheld theme. At least two Curators agreed that it did not seem a good system.
I DO hope that iSpot records are not harvested just because someone is ‘As sure as I can be’ or because the Likely banner is present. iSpot is crammed with wrong IDs carrying the Likely banner, some with agreements, though agreements are RARE on incorrect ones.
The claim that iNat is an ‘online social network’ pales against iSpot’s strong sense of social cohesion and responsibility.
Where are the long chatty, sometimes critical, comment trails in iNat?

my understanding is that “research grade” on iNaturalist just means that a record has more than one agreeing identification and meets some threshold of consensus. I don’t think it’s meant to mean 100% correct IDs all the time. I wonder if anyone has tried to quantify the accuracy of iSpot and iNaturalist records?
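Roughly, as I understand it, the rule is something like the sketch below; the two-thirds fraction and the two-identification minimum are my assumptions from iNaturalist’s public description, not anything official.

```python
from collections import Counter

def is_research_grade(identifications, min_ids=2, agreement=2/3):
    """Rough sketch of a consensus rule: a record reaches 'research grade'
    once at least `min_ids` identifications exist and the most popular taxon
    accounts for more than `agreement` of them. Both thresholds are assumptions."""
    if len(identifications) < min_ids:
        return False
    taxon, count = Counter(identifications).most_common(1)[0]
    return count / len(identifications) > agreement

# Two identifications that agree is enough under this rule:
print(is_research_grade(["Euphrasia nemorosa", "Euphrasia nemorosa"]))    # True
print(is_research_grade(["Euphrasia nemorosa", "Euphrasia officinalis"]))  # False
```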

Re: long chatty comment trails - I glanced at the blog and there are some nice stories about interesting records highlighted there, such as this one https://www.inaturalist.org/blog/46272-a-euphorbia-observation-in-brazil-provides-tantalizing-natural-history-clues-observation-of-the-week-2-8-21 which links to this record https://www.inaturalist.org/observations/67163775 where there’s a long chatty comment trail. It’s also interesting that one of the experts explicitly mentions not relying on “research grade” as a perfect metric, as I suspected.

What percentage of misidentifications in a dataset, be it iSpot or iNat or GBIF or NBN, do you think would be tolerable for use in scientific analyses? 95% accuracy? 99% accuracy? It seems like there’s a quality/quantity trade-off, but it wouldn’t be hard for either system to tweak its algorithms to meet the desired threshold.
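To make the trade-off concrete, here is a minimal sketch (field names and numbers are entirely made up) of what “tweaking the algorithm” could look like: require more agreements and you keep fewer, but presumably cleaner, records.

```python
# Hypothetical records: each carries a count of agreeing identifications.
records = [
    {"taxon": "Euphrasia nemorosa", "agreements": 1},
    {"taxon": "Vanessa atalanta", "agreements": 4},
    {"taxon": "Armillaria mellea", "agreements": 2},
]

def filter_by_agreements(records, min_agreements):
    """Keep only records with at least `min_agreements` agreeing IDs.
    A stricter threshold trades quantity for (hopefully) accuracy."""
    return [r for r in records if r["agreements"] >= min_agreements]

for threshold in (1, 2, 3):
    kept = filter_by_agreements(records, threshold)
    print(f"threshold {threshold}: {len(kept)}/{len(records)} records retained")
```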

Thanks James. I think we could each make a case to strengthen various theories here.

There are THOUSANDS of Likely (RG?) IDs in iSpot, some with agreements, that are wrong…
There are 40 observations over two recent days here (in iSpot) that have no comments at all and probably never will.
My case is strong though: there are many thousands of Records in the GBIF Portal, arising from the iNat harvesting software, that are incorrect. I find them regularly.
They are commonly the FIRST ID in the Dictionary drop-down, agreed to by ‘unqualified’ people.
A quick visit to RG Euphrasia in iNat is fairly convincing.
https://www.inaturalist.org/taxa/118893-Euphrasia-nemorosa/browse_photos

There appear to be no (stated) iSpot records (data) in either GBIF or NBN.
My only interest here (I subscribe and upload to both sites) relates to reasonable accuracy of World and National Records. I would not like to (I cannot) put a percentage of accuracy required. I really hope iSpot is never trawled for unverified Likely IDs to be added to World- or National-Recording schemes.
“…will make it much easier for iSpot data to go onto NBN” is from miked’s comment. I suspect he has a proviso in mind!
I have a few iNat RG records in the GBIF Portal
I may be the only person in iSpot who has ever tried (and failed) to grade the Probability of Accuracy in Observations. I subjectively based it on the number and ‘quality’ of agreements and the clarity and number of photos.
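For illustration only, the sort of thing I was attempting might be sketched like this; the weights and caps are entirely invented, and this is not how iSpot actually works.

```python
def accuracy_score(n_agreements, agreer_reputations, n_photos, photo_clarity):
    """Very subjective sketch: score an observation from the number and
    'quality' of agreements plus the number and clarity of photos.
    All weights and caps are invented for illustration."""
    agreement_part = min(sum(agreer_reputations) + n_agreements, 10) / 10   # 0..1
    photo_part = min(n_photos, 4) / 4 * photo_clarity                       # clarity in 0..1
    return round(0.7 * agreement_part + 0.3 * photo_part, 2)

# One low-reputation agreement on a single blurry photo vs.
# three knowledgeable agreements on several sharp photos:
print(accuracy_score(1, [0.2], 1, 0.3))             # low score
print(accuracy_score(3, [0.9, 0.8, 0.7], 3, 0.9))   # high score
```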

I don’t think it is a matter of achieving an acceptably low percentage of wrong records. False records are likely to be outliers in the dataset, so they will carry more significance than the proportion they comprise suggests.
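A toy illustration, with entirely made-up numbers: one wrong record in a hundred is only a 1% error rate, yet it can multiply an apparent range extent several times over.

```python
# 99 genuine records along roughly 100 km of coast, plus one error ~500 km away.
latitudes = [50.0 + i * 0.01 for i in range(99)]   # genuine: 50.00-50.98 N
latitudes.append(55.5)                              # the single wrong record

genuine_extent = max(latitudes[:99]) - min(latitudes[:99])
apparent_extent = max(latitudes) - min(latitudes)
print(f"extent without the error: {genuine_extent:.2f} degrees")  # ~0.98
print(f"extent with 1% errors:    {apparent_extent:.2f} degrees")  # ~5.50
```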

But surely we don’t presume that any source of biodiversity data has 100% identification accuracy. For example, the museum world has a significant problem in this regard: https://www.forbes.com/sites/shaenamontanari/2015/11/17/half-the-worlds-museum-specimens-are-wrongly-labeled-but-who-is-to-blame

I agree, but I haven’t said anything about assuming 100% accuracy.

If you aren’t saying we need 100% accuracy, then you are implying that there is some acceptable percentage of wrong records. What is this percentage, if not 100%?

And has anyone checked whether citizen science sources like iSpot or iNaturalist have higher error rates than, for instance, museum collections? These are the 533 GBIF records of Agamid Dragons from the British Museum of Natural History.

I can tell you for sure that this Lithuanian record is an error

That puts the accuracy rate for the British Museum Agamid Dragons at no better than 99.8% (532 of 533, and probably lower) - why are we not up in arms about this error?

I think that is what is called a selective quote. You missed out the first few words which reverse the meaning.

There was an assessment a few years ago of the accuracy of the ispot identifications, and it found that certain groups of organisms, such as birds or plants, were very well identified, with an error rate similar to that of experts. With other groups, such as fungi, where more features are often needed than a photo can show, the error rate was higher. It would be possible to repeat this analysis now with more data.


Aha - how interesting Mike.
BUT how do you know that IDs in iSpot are accurate?
There are very few IDs with ‘conclusive’ supportive evidence.
No ‘expert’ I know would ID from a photo alone - that would be the basis of 90% (my guess) of iSpot’s Observations.
Actually you used the usefully ‘vague’ expression “they were very well identified…”

Reviving this after 3 years, mainly because I didn’t want to make another new topic when this one already covered what I wanted to ask. I actually didn’t know iSpot records didn’t go towards anything until I was speaking to someone from BWARS, and they mentioned it essentially as a reason why they don’t hold iSpot in particularly high regard. It made me wonder - has that changed at all? Do our records go towards anything, or are they solely kept in iSpot?

If it’s the latter, it feels a shame in a way. County verifiers are stretched incredibly thinly, to the point that a lot of records on places like iRecord go unviewed and therefore uncounted. It seems like a community push may end up being what we have to lean towards in order to identify things, even though that may be less reliable. I understand the earlier debate about the margin of error, and I can’t say I personally have a solution to it, but I do think it may end up being something that has to be discussed further as time goes on.

It would be interesting to know if there was ever another repeat of the accuracy analysis for iSpot. I would half expect that accuracy across the board will have increased, though I will say that my “hypothesis” is based entirely on anecdotal evidence - I don’t frequent records such as mushrooms (which I’d say are easily among the hardest to ID correctly), so I’d have no clue if those records end up with incorrect IDs.

Here’s a thought though - I wonder if there will ever be a time when ‘AI’ could be used to ‘verify’ identifications. When I looked it up, Google claims that its image ID has a “less than 2% error rate”, which I personally think is a load of hogsquash because, if an image is blurry, it gets it wrong a heck of a lot. However, when the image quality is good, it can be freakishly accurate. I do wonder if something like that is where we will end up with wildlife recording. For instance, images could be scanned and compared to images elsewhere of the user’s given identification and then given a percentage chance of accuracy, a little bit like what PlantNet does, and then observations with, say, less than a 75% chance of accuracy are ignored. That’s all massive speculation coming from someone who knows nothing about verifying identifications, nor do I know anything about AI, but it’s an interesting hypothetical for the future imo.
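Purely to illustrate the idea - the classify_image function, the field names and the 75% figure are all hypothetical, nothing here is a real API:

```python
def verify_with_classifier(observation, classify_image, threshold=0.75):
    """Hypothetical auto-verification: ask an image classifier how strongly it
    supports the recorder's own identification, and ignore the observation if
    the score falls below the threshold. `classify_image` is an imagined
    function returning {taxon_name: probability}."""
    scores = classify_image(observation["photo"])
    confidence = scores.get(observation["claimed_id"], 0.0)
    return ("accept", confidence) if confidence >= threshold else ("ignore", confidence)

# Pretend classifier, for demonstration only:
fake_classifier = lambda photo: {"Bombus terrestris": 0.91, "Bombus lucorum": 0.06}
obs = {"photo": "bee.jpg", "claimed_id": "Bombus terrestris"}
print(verify_with_classifier(obs, fake_classifier))   # ('accept', 0.91)
```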

Edit: Originally said 2 years but it’s been 3!! Time flies scarily fast…

On the “where do ispot records go” front: they can go anywhere. If BWARS want the records then I can give them; some schemes have wanted them, e.g. the Mammal Society, and used the ispot records in their mammal atlas as far as I recall. The situation with irecord is more confused: I have been trying to send them another batch of records for a couple of years without success, and they seem to be more concerned with getting in inat records. We have had more success with NBN, as there are now 140,000 ispot records on there and potentially all records could go on there. We are also looking at getting all ispot global records onto GBIF; we are working on aligning the species dictionaries so this can happen, but it is a long task.

« all ispot global records onto GBIF » ?
.
I would hope they only want correct records; otherwise, where’s the rigour?
.
I recently engaged with GBIF in relation to an incorrect image of Geranium argenteum, which they got from PlantNet and posted on GBIF. They said they relied on PlantNet to ensure their records were correct.
.
Which begs the question: how would ispot records be verified as good enough for GBIF?

These big aggregating systems do indeed just want data, but with health warnings about its quality. iSpot data on NBN has to go on as unverified, as they have no way yet to deal with the reputation system etc. from ispot. I suspect GBIF will be similar; once we are about to upload we will check these things again, as they can change.

It is up to the users of the data to find out what the quality of the data is; there are papers about the quality of the ispot data, e.g. from the large sample that was sent to irecord for verification. Also, each ispot observation on NBN has a link to the actual observation, so people can check their own sample.

What the data are used for has been a main topic at several meetings in recent years. Some people just want bulk data and don’t care too much about the quality (assuming there will be some noise but that statistics can deal with it); others only select the best verified data; others scan through the bulk data to check if there are any important records in their area of interest, then investigate those records individually to make sure they are correct.

Incidentally, not wanting to cast doubt on verified data, but that is not perfect either, and it might be getting less perfect as more data come in and there are rather few people to check them. I could go into a lot more detail, but basically nothing is perfect, even when you get to DNA, so it is a question of what is acceptable for the particular purpose.

We are currently wading through many hundreds of Incorrect Locations in iSpot.
I am being fed by Miked and assisted by northernteacher.
We are looking at ones which are wildly wrong, like in the Indian Ocean, and are the result of a Re-location Bug or, just as often, users’ laziness.
Now, would it be possible to develop AI to check locations for us?
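Even a crude automated check could catch the wildly wrong ones - a sketch, assuming a rough UK/Ireland bounding box (the numbers are my guesses, and it would not catch subtle errors):

```python
def location_looks_wrong(lat, lon, bounds=(49.8, 61.0, -11.0, 2.0)):
    """Crude automated check for wildly wrong locations: flag anything outside
    a rough bounding box for the community (here an assumed UK/Ireland box:
    min_lat, max_lat, min_lon, max_lon). It would catch the classic lat/lon
    swap or a point dropped in the Indian Ocean, but not subtle errors."""
    min_lat, max_lat, min_lon, max_lon = bounds
    return not (min_lat <= lat <= max_lat and min_lon <= lon <= max_lon)

print(location_looks_wrong(57.5, -4.2))    # False: Scottish Highlands, plausible
print(location_looks_wrong(-4.2, 57.5))    # True: coordinates swapped
print(location_looks_wrong(-20.0, 70.0))   # True: somewhere in the Indian Ocean
```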
.
Of (possible) interest here: I was the one who pushed for GBIF to be added to the ID Panel. I am very grateful.
This is because it is SO much easier to analyse records (I am still finding NBN unwieldy).
But GBIF is NOT collecting all NBN iSpot records, so the two Portals are quite different.

The link to RECENT uploads of 140,096 iSpot records to NBN is here (up to the end of 2013).

Find y’r own?
and so
Is the ID really correct here (no agreements), and is the location accurate?
https://www.ispotnature.org/communities/uk-and-ireland/view/observation/228119/
It is a sea anemone, and I will add an agreement and add the tag ProjectM1.

There are various reasons for that, and there are delays in the system, so data on NBN do not immediately end up on GBIF. For example, NBN only do data uploads once a month, I think, and they have issues with programmers at present, so there will be delays. I suspect there is a regular move of data to GBIF, but I am not sure what interval there is on that. There may also be restrictions, e.g. set by the original data provider, on which data can be moved from NBN onto GBIF.

‘various reasons’ thnx

Thanks for the explanation, MikeD.

As you might imagine, your answer has raised several questions in my mind. One is general: is there a link on the ispot site to what we, as ispot users, have agreed to regarding use of our contributions?

Secondly, can I access any of the papers you mention about the quality of the ispot data?

I’m trying to ignore a nagging notion « Never mind the quality, feel the width ».

I keep reminding myself that iSpot is free for me, as a user, so what the OU want to do is largely outside my agency.