In 1981 I created a program for the ZX81 called "Cashcast", which was a spreadsheet designed for budgeting. This was before spreadsheets were available. I put it on sale for £2.99 and sold about 100 copies. I should have persevered.
Thanks to Google I've found an ad for Cashcast. It was £4.95 not £2.99.
I suppose if they tried using a database it would have been MS-Access
It beggars belief that they are storing this data in a spreadsheet! Have they not heard of databases?
They aren't, the API runs from a Microsoft SQL database and the PowerBI visualisations run from it too.
I presume you're talking about the API for the Coronavirus Dashboard on gov.uk, whereas the issue with Excel storage seems to be further back in the data processing chain.
I don't think Excel is used except to generate the CSV upload files, and the database is the source of truth rather than a series of Excel files. As I said, the most likely source of error is people uploading XLSX files into a Python script which will only work with CSVs. This has become an issue since third parties (universities) have had access, so it stands to reason that the people doing the uploads didn't realise XLSX files won't work with Python.
The file size limitation makes no sense at all. Excel has over a million rows available, and the new-case-per-column story doesn't make sense either because none of the days had more than 16.5k cases reported. Most likely Politico have a political source who also doesn't understand how these things actually work.
I've seen the XLSX-into-Python fuck-up loads of times; it's definitely something that can happen in such a disparate system, with hundreds of health trusts, testing centres and now universities all reporting in separately.
How does that work? Surely a Python script that is expecting a CSV file as input simply wouldn't work if given an XLSX file. The formats are completely different! There would surely be some indication that the process had failed, such as the lack of an output file, for a start.
Well, it just doesn't work, and what happens next depends on what kind of error reporting the script has built in, what kind of queueing system there is and whether each upload is being properly monitored for success. As an end user (usually an admin assistant) I'm given instructions to prepare my upload in Excel with this structure, then to save it as a CSV and upload it to the system. Loads of people are going to miss out the save-as-CSV step but won't stick around to see the error page (if there is one; as I said, people who write the scripts like to assume that everyone understands that Python doesn't work very well with XLSX files and so won't use them).
Don't get me wrong, I'm not absolving them of anything; it's a stupid error and has led to two weeks of under-reporting. However, I find the file size stuff difficult to believe, especially based on a source that Politico have. Once again it's lack of communication from the government here that is causing the issue. If it is a file format issue then I can understand it happening, with a script written in a rush, probably using csv.reader instead of openpyxl, which is more difficult to handle and more prone to parse errors.
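To make the failure mode being argued about concrete, here is a minimal Python sketch. It assumes a hypothetical rushed loader that swallows every exception, which is not known to be how the real system was written; `careless_load`, `strict_load` and the sample bytes are all invented for illustration. The only external fact relied on is that an XLSX file is a ZIP archive, which normally begins with the bytes `PK\x03\x04`.

```python
import csv
import io

# Usual first four bytes of a ZIP archive (and therefore of a normal .xlsx).
ZIP_MAGIC = b"PK\x03\x04"


def careless_load(raw: bytes) -> list[list[str]]:
    """Hypothetical rushed loader: any failure looks like 'no rows'."""
    try:
        text = raw.decode("utf-8")
        return list(csv.reader(io.StringIO(text, newline="")))
    except Exception:
        # The error is swallowed, so the uploader never hears about it.
        return []


def strict_load(raw: bytes) -> list[list[str]]:
    """Same parsing, but refuses anything that is obviously not a CSV."""
    if raw.startswith(ZIP_MAGIC):
        raise ValueError("upload looks like an XLSX/ZIP file, not a CSV")
    text = raw.decode("utf-8")  # a decoding failure propagates to the caller
    rows = list(csv.reader(io.StringIO(text, newline="")))
    if not rows:
        raise ValueError("upload contained no rows")
    return rows


# Made-up sample data, not real case figures.
good_csv = b"date,cases\n2020-10-01,100\n"
# Stand-in for the start of an XLSX file: ZIP magic plus bytes invalid in UTF-8.
fake_xlsx = ZIP_MAGIC + b"\xff" * 16

print(careless_load(good_csv))   # parsed rows
print(careless_load(fake_xlsx))  # empty list, with no complaint at all
```

With the careless version a wrong-format upload is indistinguishable from an empty one, which is the "upload and get nothing" scenario described in the thread; the strict version turns the same upload into an error somebody has to look at.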
Today is our 35th wedding anniversary. We are out for afternoon tea at the Old Course hotel later. The planned walk is looking a little problematic, however. Some minor roads are closed with flooding around here.
Congrats! - Squeeze in a few holes too?
Lord no, what a waste of time that would be. I find golf just beyond tedious. The only good things about it are the walk and the outdoors.
The sandpit things on my local course are excellent for jumping my CRF250.
You go to extraordinary lengths to achieve a certain level of popularity, don't you? Remarkable.
Leave Dura-ace alone - he's comedy gold.
When I was 18 me and my mate did donuts on a golf course in his dad's W115 220 automatic. It open diffed and blew up the torque converter on the second loop. His dad was livid and sent him to Sunderland Polytechnic as punishment.
Great place Sunderland Poly. At least in its previous incarnation! Met my wife there 61 years ago!
They have a nice law school there.
Thought you went Geordie, not Mackem?
There’s no difference between Geordies and Mackems.
Writes a Lancastrian.....
As a minimum it must be 20 miles, and there is an entirely separate subset of Sanddancers in between.
When I was 18 me and my mate did donuts on a golf course in his dad's W115 220 automatic. It open diffed and blew up the torque converter on the second loop. His dad was livid and sent him to Sunderland Polytechnic as punishment.
Wow! Those in-house Merc. autoboxes were bulletproof.
There is a difference between the candidates chosen and printed on the ballot papers for the election on 3rd November and those for the election on 9th December.
The candidates for the Electoral College election may be different and it is the Electoral College election on 9th December that will determine the presidency.
Anxious to avoid chaos in the electoral college just months before the November vote, the Supreme Court ruled Monday that electors who formally select the president can be required by the state they represent to cast their ballot for the candidate who won their state’s popular vote. The justices unanimously rejected the claim that electors have a right under the Constitution to defy their states and vote for the candidate of their choice.
“Electors are not free agents,” Justice Elena Kagan said for the court in Chiafalo vs. Washington. “They are to vote for the candidate whom the state’s voters have chosen.” Article II of the Constitution and the 12th Amendment “give states broad power over electors, and give electors themselves no rights,” she said.
"The justices unanimously rejected the claim that electors have a right under the Constitution to defy their states and vote for the candidate of their choice." "Article II of the Constitution and the 12th Amendment “give states broad power over electors, and give electors themselves no rights.”"
So the RNC instruct Republican Electors to cast their ballot on 9th December for Pence. What State is going to use that judgement to insist that that State's Electors vote instead for Trump? Not going to happen.
States with a Dem Governor and State legislature who can award their state’s electoral college votes however they see fit.
But they won't - will they. Be serious now.
Just imagine the GOP have engaged in some voter suppression and shenanigans, and the Dems win the popular vote by 5 million but lose the electoral college because of the suppression and shenanigans. Do you think the Dems will go along with that quietly when they can correct it?
Great place Sunderland Poly. At least in its previous incarnation! Met my wife there 61 years ago!
They have a nice law school there.
Thought you went Geordie, not Mackem?
Oh, of course I go Geordie. I've just visited for various talks and networking events. The facilities are nice and the staff are lovely. Just a shame it's in Sunderland really.
Wife and I enjoyed our time there. However, 30 years later one of our nieces went and hated the place. Hated it so much she abandoned her studies.
Not the world's most surprising news, but thought it was worth a mention since this is the IoD's research. The idea that we're all going to jolly back to the office was always for the birds.
There’s no difference between Geordies and Mackems.
Writes a Lancastrian.....
They have the same area code and are in the same county.
@MaxPB I'm a bit bemused about this XLSX vs CSV issue. An XLSX is a binary file (?), completely different to a CSV, so I get that Python might not understand it and will try to interpret the file regardless, but surely it must have some kind of verbose error logging?
In 1981 I created a program for the ZX81 called "Cashcast", which was a spreadsheet designed for budgeting. This was before spreadsheets were available. I put it on sale for £2.99 and sold about 100 copies. I should have persevered.
Thanks to Google I've found an ad for Cashcast. It was £4.95 not £2.99.
Ah, the ZX81. One of my staff created a patient information to run on that, hooked up to a second hand TV. Cause of great admiration from patients.
PHE using Excel for data tells you all you need to know about their expertise.
More common than you think - the number of systems I've seen where you can upload data by loading spreadsheets.....
That sounds so awful.
Anywhere where the technology is driven by its users rather than by tech people with the power to say, "you tell us the requirements but we're going to solve the problem with our design not yours", it's going to have Excel sprouting all over the place. It's the tool of choice for non-technical people who get shit done.
Judging by the party split (Democrat), primary participation (Democrat leaning amongst indies) and age of the North Carolina early vote (Oldish) it looks like people have done just that.
A top-flight Italian football match - Juventus v Napoli - descended into chaos on Sunday when Napoli failed to turn up in Turin because of coronavirus.
After two team members tested positive this week, Napoli say they were ordered not to travel by their local health authority in Naples, the ASL. However, Italy's Serie A football league refused to call the game off. Napoli now face an automatic 3-0 defeat.
So the spreadsheet they were using for the results reached its maximum size and simply excluded all the results that followed.
Epic fail.
WTAF? Who works with large datasets and doesn't know that Excel has a maximum file size? I despair.
Who works with large datasets in Excel?!?
A colleague has encountered this in the past. It took a long time to notice.
His problem was that he was being sent a CSV file by a third party and didn't really pay any attention to how it was being generated. If you haven't written something to check the row counts daily, you might not spot the issue, especially if it is only updating existing data and all that happens is that some fields become stale.
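The daily row-count check mentioned above could look something like this minimal Python sketch. It assumes the feed is a cumulative CSV that should only ever grow, which is an assumption for illustration rather than anything stated about the colleague's system; `count_rows`, `check_feed` and the sample text are hypothetical. The two limits are the documented worksheet row caps for the old .xls format (65,536) and for .xlsx (1,048,576).

```python
import csv
import io

# Worksheet row caps for .xls and .xlsx respectively.
KNOWN_SHEET_ROW_LIMITS = (65_536, 1_048_576)


def count_rows(csv_text: str) -> int:
    """Count every non-empty row in the CSV text, header included."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    return sum(1 for row in reader if row)


def check_feed(rows_today: int, rows_yesterday: int) -> list[str]:
    """Return warnings for a cumulative feed that should only ever grow."""
    warnings = []
    if rows_today < rows_yesterday:
        warnings.append(f"feed shrank from {rows_yesterday} to {rows_today} rows")
    if rows_today == rows_yesterday:
        warnings.append("feed did not grow since yesterday")
    if rows_today in KNOWN_SHEET_ROW_LIMITS:
        warnings.append(
            f"row count {rows_today} equals a spreadsheet row limit; possible truncation"
        )
    return warnings


# Made-up sample data.
sample = "date,cases\n2020-10-01,100\n2020-10-02,120\n"
print(count_rows(sample))
print(check_feed(rows_today=65_536, rows_yesterday=65_536))
```

A check this crude would not catch stale fields inside existing rows, but a count that stops moving, or that sits exactly on a spreadsheet limit, is a cheap signal that the file was truncated somewhere upstream.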
It isn't a surprise that something like this would happen given it is all being done a bit on the hoof.
France, Spain and the UK are all about as bad as each other.
As you pointed out earlier though, our statisticians are rankly incompetent and our data collection is embarrassing but there is no evidence of us trying to actually hide anything. Every piece of stupidity is out there for everyone to see and assess. I am not so confident that is the case with the other 2.
Ah. You are suggesting that States with a Dem Governor and State legislature who can award their state’s electoral college votes however they see fit will award their Electoral College votes to Biden rather than to Trump/Pence.
OK. How many States and how many ECs are likely to go Republican but have a Dem Governor and State legislature who can switch the votes? Enough to make a difference?
This is a totally different point from the BF rules, but entertaining nevertheless.
PHE using Excel for data tells you all you need to know about their expertise.
More common than you think - the number of systems I've seen where you can upload data by loading spreadsheets.....
That sounds so awful.
The shit I've seen. When this is over, at the next PB beers I'll do a standup routine on "IT systems - the bad, the insane and the stuff with the wrong number of dimensions"
I know way too much about the Java Apache POI library. Awesome though it is - for *generating* spreadsheets.
Python is a great scripting language. But for serious computing...
There’s no difference between Geordies and Mackems.
Writes a Lancastrian.....
They have the same area code and are in the same county.
They are synonyms for each other.
Nope. My area code is 01661. Represent.
So if I wanted to call Sunderland council and Newcastle council what are the first four digits?
@MaxPB I'm a bit bemused about this XLSX vs CSV issue. An XLSX is a binary file (?), completely different to a CSV, so I get that Python might not understand it and will try to interpret the file regardless, but surely it must have some kind of verbose error logging?
Quite - unless some fool simply suppressed the error completely. So attempt to upload XLS (or XLSX) and get... nothing.
That sounds very unlikely to me. Who would build an uploading system for such important data that performed no server-side verification at all? Surely you'd also have client-side verification to ensure that, at the bare minimum, the uploaded files had the correct extension!
@MaxPB I'm a bit bemused about this XLSX vs CSV issue. An XLSX is a binary file (?), completely different to a CSV, so I get that Python might not understand it and will try to interpret the file regardless, but surely it must have some kind of verbose error logging?
Ideally you would throw up an error on clicking upload saying that it's not a valid file format, but I've seen systems that don't do anything and give the end user no feedback on success or failure. As I said, this seems much more plausible than a file size limitation. Excel can store a million rows and literally no one uses columns for anything other than headers; it's just about the stupidest idea I've heard.
If your script has been written in a rush and uses csv.reader to parse standardised CSV files it will work pretty reliably, especially in a closed system where everyone has been trained to use the system properly. It's unsurprising that this started to become an issue when third party access was granted to universities, the training probably wasn't very good and the instructions were probably ignored. I've only seen it happen about a million times.
So the spreadsheet they were using for the results reached its maximum size and simply excluded all the results that followed.
Epic fail.
WTAF? Who works with large datasets and doesn't know that Excel has a maximum file size? I despair.
The email size limit being the issue seems more believable; the maximum excel size/length is surely enormous?
Edit: Ah, if the entire list of cases is being stored in Excel I could see that being an issue! Weird that they were still able to add some cases but not all on the affected days though.
I bet the issue is that the upload system only handles CSVs (a Python limitation) and Excel files were being put into it. I've seen that happen loads of times, because the people making the system intrinsically understand that Python works properly with CSVs but doesn't with Excel files, while the people doing the uploads don't know the difference.
The format of the file sounds insane - columns for dynamic data is bad.
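For a sense of why the orientation matters, here is a small Python sketch of the arithmetic. The limits are the documented worksheet sizes (65,536 rows by 256 columns for the old .xls format, 1,048,576 rows by 16,384 columns for .xlsx); the `fits` helper, its parameters and the record counts are hypothetical, and nothing here claims to describe the actual file layout used.

```python
# (max_rows, max_columns) per worksheet.
XLS_LIMITS = (65_536, 256)
XLSX_LIMITS = (1_048_576, 16_384)


def fits(n_records: int, n_fields: int, limits: tuple[int, int],
         one_record_per_column: bool = False) -> bool:
    """Check whether a dataset fits on one worksheet (header row ignored)."""
    max_rows, max_cols = limits
    if one_record_per_column:
        # Records run across the sheet, fields run down it.
        return n_fields <= max_rows and n_records <= max_cols
    # Conventional layout: one record per row, one field per column.
    return n_records <= max_rows and n_fields <= max_cols


# Made-up figures: 50,000 records with 10 fields each.
print(fits(50_000, 10, XLSX_LIMITS))                              # one record per row
print(fits(50_000, 10, XLSX_LIMITS, one_record_per_column=True))  # one record per column
print(fits(70_000, 10, XLS_LIMITS))                               # old .xls, one record per row
```

Laid out one record per row, a modern sheet has room for about a million records; turned on its side, the same sheet runs out after 16,384, and the legacy format runs out far sooner either way.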
World beating. Betcha nobody else in the world was doing it our way
Python is a great scripting language. But for serious computing...
What do you mean by serious? Launching a nuclear strike or some data wrangling?
For the latter I don't think it really matters that much what language you use. What matters is how well it is written and managed, and how well the writer understands what they are doing.
The fact that there wasn't some sort of checking system in place to make sure that new entries were indeed being recorded as they were added to the database is mindboggling.
@MaxPB I'm a bit bemused about this XLSX vs CSV issue. An XLSX is a binary file (?), completely different to a CSV, so I get that Python might not understand it and will try to interpret the file regardless, but surely it must have some kind of verbose error logging?
The XLSX format is zipped XML, so yes, completely different to CSV. Any system that required data in CSV format would surely fail with an error message if you even tried to upload an XLSX file. At the very least, you'd write the client so that only files with a .csv extension could be accepted.
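The "zipped XML" point is easy to check from Python's standard library. The sketch below builds a tiny ZIP archive as a stand-in for a real workbook (it is not a valid XLSX, just the same container format) and shows a hypothetical `sniff_upload` helper that looks at the content as well as the extension. All function and file names are invented for illustration.

```python
import io
import zipfile


def build_stand_in_xlsx() -> bytes:
    """Build a minimal ZIP archive with XLSX-style member names."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr("xl/workbook.xml", "<workbook/>")
    return buf.getvalue()


def sniff_upload(filename: str, raw: bytes) -> str:
    """Classify an upload by content first, then by extension."""
    if zipfile.is_zipfile(io.BytesIO(raw)):
        return "zip"  # covers .xlsx, whatever the file happens to be called
    if filename.lower().endswith(".csv"):
        return "csv"
    return "unknown"


workbook = build_stand_in_xlsx()
# Made-up sample data.
plain_csv = b"date,cases\n2020-10-01,100\n2020-10-02,120\n"

print(zipfile.ZipFile(io.BytesIO(workbook)).namelist())
print(sniff_upload("cases.csv", workbook))   # a renamed workbook is still caught
print(sniff_upload("cases.csv", plain_csv))
```

Checking the extension alone would let through a workbook that someone has simply renamed to .csv, so a content check on the server side is the more robust of the two.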
So if I wanted to call Sunderland council and Newcastle council what are the first four digits?
0191 - but then again I'm not posh and living somewhere around Ponteland... Equally I think 0191 gives you Durham County Council and the preferred university for posh Oxford and Cambridge rejects.
When I was 18 me and my mate did donuts on a golf course in his dad's W115 220 automatic. It open diffed and blew up the torque converter on the second loop. His dad was livid and sent him to Sunderland Polytechnic as punishment.
Wow! Those in-house Merc. autoboxes were bulletproof.
High rpm with relatively light load is an excellent way to destroy an auto, as the output shaft speed can drop to zero in an instant while the input shaft is still raging.
@MaxPB I'm a bit bemused about this XLSX vs CSV issue. An XLSX is a binary file (?), completely different to a CSV, so I get that Python might not understand and will try to interpret the file regardless, but surely it must have some kind of verbose error logging?
The XLSX format is zipped XML, so yes, completely different to CSV. Any system that required data in CSV format would surely fail with an error message if you even tried to upload an XLSX file. At the very least, you'd write the client so that only files with a .csv extension could be accepted.
I worked with CSVs and XLSX files in my last role and we did exactly this. I was using PHP (not Python, admittedly) but when this did happen, right at the early stages, it crashed on the first line.
What we used to do was check the header row first; I imagine this is quite a typical thing to do. Anything unexpected would throw an exception.
But I suppose this is beyond them, considering they were adding each new case as a new COLUMN
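The header-row check mentioned above could look something like this in Python. It is only a minimal sketch: the column names in `EXPECTED_HEADER` and the function name are made up for illustration, since nobody here knows the real upload format.

```python
import csv
import io

# Hypothetical expected header; the real upload format is not known.
EXPECTED_HEADER = ["specimen_date", "area_code", "new_cases"]


def read_validated_rows(text):
    """Return the data rows as dicts, or raise ValueError on an unexpected header."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)  # None if the file is empty
    if header != EXPECTED_HEADER:
        raise ValueError(f"unexpected header row: {header!r}")
    return [dict(zip(EXPECTED_HEADER, row)) for row in reader]
```

Anything that is not the agreed CSV layout, including the start of a zipped XLSX file, fails on the first line rather than part-way through the load.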
XLSX isn't a binary format - compressed (and truly hideous) XML would be a better statement.
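Because an XLSX file is a zip archive of XML parts, an upload script can spot one without trusting the file extension. A rough sketch using only the standard library follows; the function name is hypothetical, and the check only says "this looks like an Excel workbook package", not that the workbook is valid.

```python
import io
import zipfile


def looks_like_xlsx(data):
    """Return True if the bytes look like a zipped Excel workbook package."""
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        return False  # plain text such as CSV ends up here
    buffer.seek(0)
    with zipfile.ZipFile(buffer) as archive:
        names = archive.namelist()
    # Office Open XML packages carry a content-types part;
    # workbook parts live under the xl/ folder.
    has_content_types = "[Content_Types].xml" in names
    has_workbook_parts = any(name.startswith("xl/") for name in names)
    return has_content_types and has_workbook_parts
```

A check like this could run before any CSV parsing and give the uploader a clear "please save as CSV first" message.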
I suppose if they tried using a database it would have been MS-Access
It beggars belief that they are storing this data in a spreadsheet! Have they not heard of databases?
They aren't, the API runs from a Microsoft SQL database and the PowerBI visualisations run from it too.
I presume you're talking about the API for the Coronavirus Dashboard on gov.uk, whereas the issue with Excel storage seems to be further back in the data processing chain.
I don't think Excel is used except to generate the CSV upload files, and the database is the source of truth rather than a series of Excel files. As I said, the most likely source of error is people uploading XLSX files into a Python script which will only work with CSVs. This has become an issue when third parties (universities) have had access, so it stands to reason that the people doing the uploads didn't realise XLSX files won't work with Python.
The file size limitation makes no sense at all, Excel has over a million rows available and the new case per column doesn't make sense either because none of the days had more than 16.5k cases reported. Most likely Politico have a political source who also doesn't understand how these things actually work.
I've seen the XLSX into Python fuck up loads of times, it's definitely something that can happen in such a disparate system with hundreds of health trusts, testing centres and now universities all reporting in separately.
How does that work? Surely a Python script that is expecting a CSV file as input simply wouldn't work if given an XLSX file. The formats are completely different! There would surely be some indication that the process had failed, such as the lack of an output file, for a start.
Well it just doesn't work, and it depends on what kind of error reporting the script has built in, what kind of queueing system there is and whether each upload is being properly monitored for success. As an end user (usually an admin assistant) I'm given instructions to prepare my upload in Excel with this structure and then to save it as a CSV and then upload it to the system. Loads of people are going to miss out the save as CSV step but won't stick around to see the error page (if there is one; as I said, people who write the scripts like to assume that everyone understands that Python doesn't work very well with XLSX files so won't use them).
Don't get me wrong, I'm not absolving them of anything; it's a stupid error and has led to two weeks of under-reporting. However, I find the file size stuff difficult to believe, especially based on a source that Politico have. Once again it's lack of communication from the government here that is causing the issue. If it is a file format issue then I can understand it happening, with a script written in a rush probably using csv.reader instead of openpyxl, which is more difficult to handle and more prone to parse errors.
That sounds very unlikely to me. Who would build an uploading system for such important data that performed no server-side verification at all? Surely you'd also have client-side verification to ensure that, at the bare minimum, the uploaded files had the correct extension!
It probably does do server-side verification, but which end user is going to stick around to see the result of that? I just don't think they've made a robust system built to handle every scenario you can throw at it.
There's a huge difference between building a system that works in a perfect environment and one that works everywhere. The former is probably what has been cooked up in a short space of time.
I'd honestly love to see the actual scripts they're using to put data into the database because they're probably extremely basic and probably don't have a fallback in case someone uploads an XLSX file.
The thought that you'd use columns for cases just seems too bizarre to be true.
Ideally you would throw up an error on clicking upload that it's not a valid file format, but I've seen systems where it doesn't do anything and gives the end user no feedback on success or failure. As I said, this seems much more plausible than a file size limitation. Excel can store a million rows and literally no one uses columns for anything other than headers; it's just about the stupidest idea I've heard.
If your script has been written in a rush and uses csv.reader to parse standardised CSV files it will work pretty reliably, especially in a closed system where everyone has been trained to use the system properly. It's unsurprising that this started to become an issue when third party access was granted to universities, the training probably wasn't very good and the instructions were probably ignored. I've only seen it happen about a million times.
I take the point on the file size limitation, but surely any sensible dev/eng would have implemented a failsafe on the backend, which would report to a log. You'd pick it up quite quickly I would have thought: you would be looping through and instantly throw an exception, because the contents would be nonsense compared to what the interpreter was expecting.
I would have thought the library would do this for you, in fact.
PHE using Excel for data tells you all you need to know about their expertise.
More common than you think - the number of systems I've seen where you can upload data by loading spreadsheets.....
That sounds so awful.
The shit I've seen. When this is over, at the next PB beers I'll do a standup routine on "IT systems - the bad, the insane and the stuff with the wrong number of dimensions"
I know way too much about the Java Apache POI library. Awesome though it is - for *generating* spreadsheets.
Python is a great scripting language. But for serious computing...
What do you mean by serious? Launching a nuclear strike or some data wrangling?
For the latter I don't think it really matters that much what language you use. What matters is how well it is written and managed, and how well the writer understands what they are doing.
It does matter. One of the problems with Python is the culture of the "developers"*. The number of times I have had to explain concepts of code structure and testing to Python and JavaScript... code writers..... There are a lot of "but it runs, ship it" Python hackers out there.
Does the library they're using not throw an exception when it starts trying to loop through a zip file rather than a CSV?
Yeah, that's probably how the error was spotted, by some junior sifting through the script success logs when a manual audit was being done.
And still 'Green' types will be unhappy, because it won't be socialism.
Oh on your bike with the culture war today. This is objectively good news. There's always more to be done but one of the few things I think this government has twigged is that money is to be made in decarbonisation.
I've seen it in businesses before. Shocks me every time I've seen it, but I've seen it too many times.
If that's the case, they must have got the DEFRA team in.
The old Farm Environment Plan form was easily the worst use of a spreadsheet I have ever seen.
It would, but it depends on who is monitoring the exceptions and whether the end user is actually notified of a script failure immediately or even by email if there is a queue.
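To make that point concrete, here is a minimal sketch of an upload handler that both logs the failure for whoever monitors the system and hands a message back for the end user. Everything named here (`ingest_upload`, `store_row`, the `upload` logger) is hypothetical; it is an illustration of the idea, not a guess at what the real scripts do.

```python
import csv
import io
import logging

logger = logging.getLogger("upload")


def ingest_upload(filename, data, store_row):
    """Parse uploaded CSV bytes; return (ok, message) so the caller can tell the user."""
    try:
        # Compressed XLSX bytes are very unlikely to be valid UTF-8,
        # so a workbook uploaded by mistake will usually fail here.
        text = data.decode("utf-8")
        rows = list(csv.reader(io.StringIO(text)))
        if not rows:
            raise ValueError("file is empty")
    except (UnicodeDecodeError, csv.Error, ValueError) as exc:
        # Log for the monitoring team and report back to the uploader.
        logger.error("upload %s rejected: %s", filename, exc)
        return False, f"{filename} was not accepted: {exc}"
    for row in rows:
        store_row(row)
    return True, f"{filename}: {len(rows)} rows stored"
```

The exception is raised either way; what matters is that the failure ends up in front of both the monitoring team and the person who uploaded the file.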
Okay...but should the team monitoring the application not also be receiving this information?
It boggles the mind even more when I think about it, that they were using columns.
It's literally easier to not use columns, why on Earth were they doing that?
I just don't see how that's true, what is it even based on?
Do you mean the fact they were using columns, or the fact it's easier to use rows rather than columns and MS makes it very clear what columns are used for?
Hey... I'll have you know that I live within the Newcastle upon Tyne city limits!
One would imagine yes, but in either scenario (filesize or incorrect formatting) it seems that they weren't.
A Daily Mail article. Must be true.
I think it's more likely that the columns (even if that were the case) were the result of whatever software might be outputting TO the spreadsheet and/or some issues with file formats (e.g. opening and resaving!).
As you say it is impossible to think a human used 16,000 columns...
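For reference, the "16,000 columns" figure is close to a documented sheet limit: an XLSX worksheet holds at most 1,048,576 rows and 16,384 columns, while the legacy XLS format holds 65,536 rows and 256 columns. Which of these limits, if any, was actually hit is not something anyone in this thread knows. The sketch below only shows how cheap a guard would be; the `fits` function is a made-up name for illustration.

```python
# Worksheet size limits as documented by Microsoft for each file format.
LIMITS = {
    "xls": (65_536, 256),         # legacy Excel 97-2003 format: (rows, columns)
    "xlsx": (1_048_576, 16_384),  # Excel 2007 and later: (rows, columns)
}


def fits(fmt: str, n_rows: int, n_cols: int) -> bool:
    """Return True if a table of this shape fits on one sheet of the given format."""
    max_rows, max_cols = LIMITS[fmt]
    return n_rows <= max_rows and n_cols <= max_cols
```

A table with one case per row would need a lot of rows before it hit the XLSX limit, whereas one case per column runs out at 16,384, which is why the orientation matters so much to this story.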
Do you mean the fact they were using columns, or the fact it's easier to use rows rather than columns and MS makes it very clear what columns are used for?
The source for them using columns and not rows? It's just so absurd
Okay but even in that case, to use a spreadsheet as your source of truth compared to the/a db is nuts.
PHE using Excel for data tells you all you need to know about their expertise.
More common than you think - the number of systems I've seen where you can upload data by loading spreadsheets.....
That sounds so awful.
The shit I've seen. When this is over, at the next PB beers I'll do a standup routine on "IT systems - the bad, the insane and the stuff with the wrong number of dimensions"
I know way too much about the Java Apache POI library. Awesome though it is - for *generating* spreadsheets.
Python is a great scripting language. But for serious computing...
What do you mean by serious? Launching a nuclear strike or some data wrangling?
For the latter I don't think it really matters that much what language you use. What matters is how well it is written and managed, and how well the writer understands what they are doing.
It does matter. One of the problems with Python is the culture of the "developers"*. The number of times I have had to explain concepts of code structure and testing to Python and JavaScript... code writers..... There are a lot of "but it runs, ship it" Python hackers out there.
*Many I wouldn't class as real developers
Surely that's just an example of the writer not understanding what they are doing?
I agree there is a certain culture around different languages, but it doesn't have to be like that. I'm sure I could write really terrible stuff in any language.
Columns for anything other than headers. It's mind bogglingly stupid.
Not only have they made their job more difficult by using Excel, they're also creating all sorts of potential issues with data corruption and a single point of failure!
I meant output by the local system for transfer to the national system. Obviously that isn't the way you SHOULD do things, but not impossible to imagine.
Not only have they made their job more difficult by using Excel, they're also creating all sorts of potential issues with data corruption and a single point of failure!
Don't be silly. There will be a backup file on Boris's desktop called "COVID-19 tests(1)(1) copy copy.xlsx".
That would make much more sense. There's no way that the entire thing is being managed from one Excel workbook.
Why would you do that though? Do these systems not have APIs that can be used?
Weaning organisations off Excel is hard. The whole planet is infested with the stuff.
But only for those clients who can't import the data directly from their core systems...
They are synonyms for each other.
Though inevitably like the last few years of Blair and Brown his speech will be very much compared to Boris' tomorrow
Judging by the party split (Democrat), primary participation (Democrat leaning amongst indies) and age of the North Carolina early vote (Oldish) it looks like people have done just that.
After two team members tested positive this week, Napoli say they were ordered not to travel by their local health authority in Naples, the ASL. However, Italy's Serie A football league refused to call the game off. Napoli now face an automatic 3-0 defeat.
--------
His problem was that he was being sent a CSV file by a third party and didn't really pay any attention to how it was being generated. If you haven't written something to check the row counts daily, you might not spot the issue, especially if it is only updating existing data and all that happens is that some fields become stale.
It isn't a surprise that something like this would happen given it is all being done a bit on the hoof.
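The daily row count check mentioned above is only a few lines of code. This is a rough sketch assuming the third party's file arrives as CSV text with a single header row; the function names and the 50% threshold are invented for illustration and would need tuning against real data.

```python
import csv
import io


def count_data_rows(csv_text: str) -> int:
    """Count non-empty rows after the header line."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    next(reader, None)  # skip the header row, if there is one
    return sum(1 for row in reader if row)


def check_row_count(today: int, yesterday: int, max_drop: float = 0.5) -> None:
    """Raise if today's file is empty or far smaller than yesterday's."""
    if today == 0:
        raise ValueError("today's file has no data rows")
    if yesterday > 0 and today < yesterday * (1 - max_drop):
        raise ValueError(f"row count fell from {yesterday} to {today}")
```

A check like this will not catch every truncation, for instance a feed that stops growing without shrinking, but it turns a silent failure into a loud one in the most obvious cases.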
Too many seesaw states.
OK. How many States and how many ECs are likely to go Republican but have a Dem Governor and State legislature who can switch the votes? Enough to make a difference?
This is a totally different point from the BF rules, but entertaining nevertheless.
https://twitter.com/bbclaurak/status/1313074854556467201?s=20
Writes an Essex boy!
If your script has been written in a rush and uses csv.reader to parse standardised CSV files it will work pretty reliably, especially in a closed system where everyone has been trained to use the system properly. It's unsurprising that this started to become an issue when third party access was granted to universities: the training probably wasn't very good and the instructions were probably ignored. I've only seen it happen about a million times.
What we used to do was check the header row first, I imagine this is probably quite a typical thing to do. Anything unexpected would throw an exception.
But I suppose this is beyond them, considering they were adding each new case as a new COLUMN.
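Roughly what the header check looks like, as a sketch using only the standard library. The column names in `EXPECTED_HEADER` are hypothetical, as is the `read_rows` function; the real files will have their own layout. Note that `csv.DictReader` takes the first row as the field names but does not validate them against anything, so the comparison has to be done explicitly.

```python
import csv
import io

# Hypothetical column names, for illustration only.
EXPECTED_HEADER = ["specimen_date", "area_code", "new_cases"]


def read_rows(csv_text: str) -> list[dict[str, str]]:
    """Parse CSV text, refusing to continue if the header row is not as expected."""
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    if reader.fieldnames != EXPECTED_HEADER:
        raise ValueError(f"unexpected header: {reader.fieldnames!r}")
    return list(reader)
```

An XLSX file, a transposed sheet, or a file with renamed columns would all fail at the first line rather than loading garbage.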
But apart from that....
I would have thought the library would do this for you, in fact
https://twitter.com/eleanormargolis/status/1313064245798596608?s=21
The old Farm Environment Plan form was easily the worst use of a spreadsheet I have ever seen.
Energy companies want to use them to black you out when the above proposals don't produce nearly enough for our needs.
Or are we just guesstimating?