UK recording systems

Then the question is whether to submit the 75 to 85%, flag the unsent records, and find a solution to the hidden records in due course.

Hidden posts: I am wondering what sort of solution could distinguish between intentionally hidden posts (which I sometimes make) and unintentionally hidden posts.
If users with lots of hidden posts have not responded, one reason could be that they want them to remain hidden.

I think that statisticians take pride in analysing their data as exactly as possible; and as Gulvain says, they give error bars and probabilities which not everyone else does. Prediction, on the other hand, is more difficult, especially about the future.

Another issue is that I later unhide some of my intentionally hidden posts, e.g. if something rare is no longer there, or if I just change my mind about whether it should be hidden. I generally try to keep as little as possible hidden.

If I move house, I’d unhide all the observations made in my garden.

Quantum mechanics is a tricky one. It is one of the fundamental principles of QM that you cannot measure things exactly at the atomic level, as any attempt to measure things affects the things being measured. (Heisenberg’s uncertainty principle, although I’m not sure of the spelling.)
In general, I agree. There is no such thing as an exact science. But with enough data, pretty accurate conclusions can often be drawn. Meteorology is an interesting example. With more and more data and more and more powerful computers, the weather forecasts are somewhat better than when I had a holiday job at the Met Office 50 years ago! But they are not that much better!

@JoC and @Ken_Noble Weather models break the atmosphere up into cells, with a whole host of initial conditions. I remember discussing the accuracy of weather forecasters with a modeller who said that they had achieved astonishing accuracy with their models, the only problem being that they ran slower than real time. So, in order to predict the weather, accuracy was sacrificed. Computing power keeps increasing, but getting all the start conditions is still a challenge. I still think that forecasts are a lot better than they have ever been.

It is dangerous ground to confuse the observer effect with the uncertainty principle even though they are related. It gets horribly mathematical, and I have no intention of relearning enough to try and say why!

Nevertheless, in spite of there being a lot of statistics in quantum electrodynamics it is probably the most accurate theory we have. Any experiment worth its salt will have error bars on the graphs and its results.

Perhaps you know the joke about astronomers (that they have error bars on their exponents), though that is somewhat dated nowadays, with 50 years of improvement in instruments.

Not heard that one. I like it though.

The only similar one I know is that if you lay 50 economists end to end they won’t reach a conclusion.

I will say that all of this talk about statistics and error margins is way above my pay grade. All I can remember is when I did A-Level Geography, we had to study things like which grasses had been observed where and how accurate these things could be. In the end, the conclusion we reached was to randomly throw a quadrat and count the different species without saying which species were which. That still came with issues though because you’d have to be very well-acquainted with grasses to tell each species apart so we were likely miles off in our results. After that, we reached a new conclusion - throw the quadrat randomly and simply count which squares had greenery on them and which didn’t, then work out a percentage of coverage and then an overall average. A-Level Geography involved a lot of statistics which is why I struggled so much with it because statistics are horrifically difficult to get to a wide agreement on.
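The second approach described above (count the squares with greenery, turn that into a percentage, then average over the throws) can be sketched roughly as follows. This is only an illustration: the function names and the example counts are made up, and it assumes a gridded quadrat with a known number of squares.

```python
def percent_cover(squares_with_greenery: int, total_squares: int) -> float:
    """Percentage of quadrat squares that contain any greenery."""
    if total_squares <= 0:
        raise ValueError("total_squares must be positive")
    if not 0 <= squares_with_greenery <= total_squares:
        raise ValueError("squares_with_greenery must be between 0 and total_squares")
    return 100.0 * squares_with_greenery / total_squares


def mean_cover(throws: list[tuple[int, int]]) -> float:
    """Average percentage cover over several quadrat throws.

    Each throw is (squares_with_greenery, total_squares).
    """
    if not throws:
        raise ValueError("need at least one throw")
    covers = [percent_cover(green, total) for green, total in throws]
    return sum(covers) / len(covers)


# Hypothetical counts from three throws of a 100-square quadrat.
example_throws = [(60, 100), (45, 100), (75, 100)]
print(mean_cover(example_throws))  # 60.0
```

Counting presence or absence per square avoids the identification problem, at the cost of saying nothing about which species make up the cover.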

Another time was when I had to count and determine sediment types along beaches. Funnily enough, it’s hard to determine whether a red piece of rock the size of a grain of sand is x, y or z. In those instances, you just have to go one step back. I ended up just logging colour and size and figured out the average colour and average size which would generally give at least a hint to which type of sediment was most abundant.

This is all essentially useless information but I had no clue if these methods were remotely useful to anything. I suppose taking one step back doesn’t always work with verifying records - my first thought was to just clump them all as genus unless they are very obviously incorrect. Like, if I recorded Entomobrya intermedia, it could just be sent across as an Entomobrya identification and then I don’t know what next, I haven’t thought it through. Perhaps if records can at least be put to genus and then can be verified by an expert, that way they are still being recorded more specifically than “bird” or “insect” in the meantime.
I’m guessing that you can’t just chuck random math at it though, e.g. just use [image] and call it a day :rofl:
Also, I got a D in Maths in school and didn’t understand what division was until I was 12 so obviously take every single thing I say here with a grain of salt.

“Since 95% of values fall within two standard deviations of the mean according to the 68-95-99.7 Rule, simply add and subtract two standard deviations from the mean in order to obtain the 95% confidence interval. Notice that with higher confidence levels the confidence interval gets large so there is less precision” - I’d love to know what half of this even means. It sounds interesting but I’m absolutely useless at maths. When it comes to a conversation involving maths, I become the turd in the pool :joy::joy:


The standard deviation is a measure of how spread out observations are relative to the mean. (The mean being the sum of all observed values divided by the number of values.) A small standard deviation means that the observations are more clustered around the mean, and a large standard deviation that they are less clustered about the mean.

For a particular distribution (the normal distribution) of values it can be calculated that 68% of values are less than one standard deviation away from the mean, 95% are less than two standard deviations away, and 99.7% are less than three standard deviations away. (These are approximate numbers - to two decimal places they are 68.27%, 95.45% and 99.73%.)
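For anyone who would rather see this as arithmetic, here is a small sketch using only Python's standard library. The sample values are made up for illustration; the fractions for the normal distribution come from the standard identity that the share of values within k standard deviations of the mean is erf(k / sqrt(2)).

```python
import math
import statistics

# Hypothetical observations, chosen so the numbers come out round.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(sample)      # sum of values divided by how many there are
spread = statistics.pstdev(sample)  # standard deviation of exactly these values
print(mean, spread)                 # 5 2.0


def fraction_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} SD: {100 * fraction_within(k):.2f}%")
# within 1 SD: 68.27%
# within 2 SD: 95.45%
# within 3 SD: 99.73%
```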

When doing research you often do not know the true distribution beforehand, and because a randomly selected sample does not exactly reflect the true distribution, you want to know how the experimentally measured mean relates to the mean of the true distribution. If you add and subtract two standard errors (the standard deviation divided by the square root of the number of observations) from the experimental mean, you are about 95% sure that the true mean lies within that range. In much scientific research that 95% level is considered “statistically significant” and treated as a scientifically valid result. (Physicists, or at least particle physicists, having been burned by discovering particles that didn’t exist, adopt a much tougher standard.) For example, if you want to test whether a medical treatment works you see whether the mean outcome in the absence of treatment lies outside the 95% confidence limits of the outcome with the treatment. This is becoming controversial - it means that even when there is no real effect, about 1 in 20 experiments will appear to show one - and with the possibility of experimental biases it’s actually worse than that.

The last sentence just means that the wider the range, the more confident you are that the true value lies within the range.
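A rough sketch of that trade-off, assuming what you care about is the mean of a sample and that a normal approximation is good enough. The readings are invented, and the function name is just for illustration. Note that for a sample mean the spread used is the standard error (the standard deviation divided by the square root of the sample size).

```python
import math
import statistics
from statistics import NormalDist


def mean_confidence_interval(values, confidence=0.95):
    """Approximate confidence interval for the mean of `values`.

    Uses the normal approximation, which is an assumption; for small
    samples a t-distribution would give a slightly wider interval.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    mean = statistics.fmean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    # Number of standard errors either side of the mean (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * standard_error, mean + z * standard_error


# Hypothetical measurements.
readings = [4.8, 5.1, 5.0, 4.7, 5.3, 5.2, 4.9, 5.0]

low95, high95 = mean_confidence_interval(readings, 0.95)
low99, high99 = mean_confidence_interval(readings, 0.99)
print(f"95%: {low95:.3f} to {high95:.3f}")
print(f"99%: {low99:.3f} to {high99:.3f}")  # wider range, more confidence
```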

As for the difficulty of statistics, forty-odd years ago I encountered an intermittent hardware bug. Every few hundred writes a magnetic tape drive would prepend an additional byte to the data being written. The appropriate model here was a Poisson distribution (rather than a normal distribution, which I more or less understood); I had to find a tame mathematician to tell me how many trials were necessary to be reasonably sure that the bug was fixed. In the process of investigating this (proving it was a hardware problem) I destroyed a magnetic tape cartridge or several by writing over the same bit of the tape over and over again.
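One common way of framing the "how many trials" question, which may or may not be what the mathematician actually did, is to ask how many clean writes in a row would be very unlikely if the bug were still there. The sketch below assumes each write fails independently at a fixed rate; the 1-in-300 figure is only a stand-in for "every few hundred writes".

```python
import math


def clean_writes_needed(failure_rate: float, confidence: float = 0.95) -> int:
    """Smallest n such that n clean writes in a row would occur with
    probability at most (1 - confidence) if the bug were still present."""
    if not 0 < failure_rate < 1:
        raise ValueError("failure_rate must be between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    # P(no failures in n writes) = (1 - p)**n, which is close to exp(-n * p),
    # the Poisson probability of zero events when n * p failures are expected.
    return math.ceil(math.log(1 - confidence) / math.log(1 - failure_rate))


ASSUMED_RATE = 1 / 300  # hypothetical: one bad write in about 300
print(clean_writes_needed(ASSUMED_RATE, 0.95))
print(clean_writes_needed(ASSUMED_RATE, 0.99))
```

A handy rule of thumb falls out of the Poisson form: for 95% confidence you need roughly three times the typical gap between failures.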

I remember writing pseudorandom number generators for various different distributions, and then writing test harnesses to demonstrate that the pseudorandom numbers did indeed follow the correct distribution.
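As a toy version of that kind of exercise (not a reconstruction of the original generators, which aren't described), here is an exponential generator built by inverse-transform sampling, with a crude harness that checks the sample mean and median against their theoretical values. The tolerance, seed and sample size are arbitrary choices.

```python
import math
import random


def exponential_sample(rate: float, rng: random.Random) -> float:
    """Draw from an exponential distribution by inverse-transform sampling.

    If U is uniform on [0, 1), then -ln(1 - U) / rate is exponential with
    the given rate.
    """
    return -math.log(1.0 - rng.random()) / rate


def check_exponential(rate: float, n: int = 100_000, seed: int = 12345,
                      tolerance: float = 0.02) -> bool:
    """Crude check: the sample mean should be near 1/rate, and about half
    the draws should fall below the theoretical median ln(2)/rate."""
    rng = random.Random(seed)
    draws = [exponential_sample(rate, rng) for _ in range(n)]
    sample_mean = sum(draws) / n
    theoretical_median = math.log(2) / rate
    fraction_below = sum(d < theoretical_median for d in draws) / n
    mean_ok = abs(sample_mean - 1 / rate) <= tolerance / rate
    median_ok = abs(fraction_below - 0.5) <= tolerance
    return mean_ok and median_ok


print(check_exponential(2.0))
```

A real harness would use a proper goodness-of-fit test (chi-squared or Kolmogorov-Smirnov) rather than two summary numbers.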

I don’t think I would be able to understand the maths any more. Ah well.

I can identify with that. At university, I more or less gave up on statistics. I reckoned that I could still get a reasonable degree if I did OK in all the other modules.
One of the most trying aspects of physics, for me, was that when you wrote up an experiment, you had to draw error bars on all of your readings and then try and work out the degree of confidence that you could place on your result. This was much harder than just calculating the result.
Since I left university, I’ve had very few brushes with statistics!

It is because of people like you that Michael Gove wants all schools to be above average.

Is that aimed at someone in particular, or at a group of us? (I find your intended meaning unclear.)

Which is worse? Politicians not understanding statistics? Or politicians not understanding exponential growth?

First, he’s got to find enough teachers who understand statistics.
I had to get help from the BTO with one of my research projects; and I was pretty reasonable at maths. (Top grade in A-level, for example.)

Thanks Jins for bringing this hairy springtail to the discussion.
I learn something new on every visit to iSpot and this animal is, according to this site, not easy to identify.

The GBIF distribution is very local so, statistically speaking, I may have seen one, but without your flagging it I would not even have known it existed to be seen.

The GBIF distribution: Entomobrya intermedia G.Brook, 1884

And there are 374 other species……

I’m slow. I only just got the irony of that quote.