toot.wales is one of the many independent Mastodon servers you can use to participate in the fediverse.
We are the Open Social network for Wales and the Welsh, at home and abroad! The independent social network for Wales, powered by Mastodon!

a pox on whoever came up with the fucking byte order mark

so we have several google docs for the script of kitsune tails. i download them as txt files, and a batch file concatenates them with the `type` command. then i feed the concatenated full script to our script tool

anyway turns out when you download txt files from google docs they stick a fucking byte order mark in em. so concatenating them just sticks random BOMs all through your document. i never noticed because of a fluke of the tool
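A minimal sketch of what goes wrong here, in Python rather than a batch file. The file contents and function names are made up for illustration, and it assumes the exports are UTF-8 with a leading BOM (bytes EF BB BF). Gluing the files together byte for byte keeps every file's BOM, while decoding each piece with Python's `utf-8-sig` codec drops one leading BOM per piece.

```python
import codecs


def concat_raw(chunks):
    """Byte-level concatenation, similar in effect to `type a.txt b.txt > out.txt`."""
    return b"".join(chunks)


def concat_without_boms(chunks):
    """Decode each chunk with 'utf-8-sig', which removes one leading BOM if present."""
    return "".join(chunk.decode("utf-8-sig") for chunk in chunks)


# Hypothetical exports: each one starts with a UTF-8 BOM.
part_one = codecs.BOM_UTF8 + "scene one\n".encode("utf-8")
part_two = codecs.BOM_UTF8 + "scene two\n".encode("utf-8")

raw = concat_raw([part_one, part_two])             # BOM at the start AND mid-file
clean = concat_without_boms([part_one, part_two])  # no BOMs left
```

The second BOM in `raw` is the one that ends up in the middle of the script, where most tools treat it as an invisible character instead of a file marker.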

so anyway now i've added code to strip BOMs from the start of any line but jesus christ this is why there should not be byte order marks, ever, and if your software is putting them into plaintext files in this, the year of our lord 2024 then you deserve to get reality checked right into the pavement
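A rough sketch of that kind of fix, not the actual code from the script tool. It assumes the script has already been decoded to text, so each BOM shows up as the single character U+FEFF, and the function name is invented here.

```python
BOM = "\ufeff"  # how a decoded byte order mark appears in a Python str


def strip_line_boms(text):
    """Remove any BOM characters from the start of every line, keeping line endings."""
    lines = text.splitlines(keepends=True)
    return "".join(line.lstrip(BOM) for line in lines)


# Hypothetical concatenated script with a BOM at the start of each original file.
script = "\ufeffscene one\nmore text\n\ufeffscene two\n"
cleaned = strip_line_boms(script)
```

This only touches BOMs at the start of a line, which matches the case described above where every source file begins on a fresh line. A U+FEFF in the middle of a line is left alone.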

@eniko It's too late and I am about to go to bed .. one day I'll tell you my story with UNICODE and "let's have a bunch of different people using different PCs/OSes/etc. do some translations" .. I thought "unicode is unicode, it's a standard, no?" .. erm NO .. maybe unicode is, but not the way it is saved/handled by Mac, Windows, Office, OpenOffice and other things .. actually there's UTF-16 .. UTF-32 .. and well .. "never the same document twice" .. and yes, the BOM: some put it in, some don't ..

@eniko In essence I found out "the hard way" that if you are dealing with Windows, Mac, Linux, Notepad, Wordpad, Word, OpenOffice, the Mac version of Word, the Mac version of OpenOffice and Excel, and you try to combine "N WHAT-SHOULD-BE PLAIN TEXT FILES .TXT" into one .XLS to be exported as .CSV and re-imported as 'strings' via C++ into a game .. you are in for a STRAMASSIVE PAIN IN THE ... also "UNICODE" and "WCHAR" are not quite the same thing, nor are UTF-16 and UTF-32 .. Actually I think UNICODE talks ONLY about CONT

@eniko CONT "the code points" i.e. "AE" is assigned 0x0f3adbcc or such, that's all it says. That number by itself DOES NOT tell you "how your machine/app is going to store it in files/whatever", whether there is a BOM, what order the bytes will be in on disk, or "if your app may decide to prepend a header or crap to it" .. it only says "AE" is going to be that UNICODE value. Then there's UTF-8/16 ... another smaller can of worms, and the WCHAR that is not quite UNICODE .. and don't get me started on Arabic, I CONT.

Giles Goat

@eniko CONT. NEVER been able to make it work correctly. In Arabic, if you write "AB" it gets "melted" into some "glyph", let's say '0', but then you say OH NO, it's left to right and it should be right to left, so you write "BA" and see what happens .. (it's NOT '0' any more) .. try to cut and paste Arabic from Google Translate and tell me if you get it right .. but yeah, text is a LOT MORE COMPLEX than it first looks! Multilang is a nightmare! Not to mention formatting.
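To make the "code point versus bytes on disk" point from earlier in the thread concrete, here is a small sketch. The character Æ (U+00C6) is picked here purely as an example and is not necessarily what "AE" meant above. The helper name is invented, and the encoding names are the ones Python's standard codecs use.

```python
def encodings_of(ch):
    """Return the bytes a single character becomes under several encodings."""
    names = ("utf-8", "utf-8-sig", "utf-16-le", "utf-16-be", "utf-32-le")
    return {name: ch.encode(name) for name in names}


ch = "\u00c6"            # Æ, one code point
code_point = ord(ch)     # the number Unicode assigns to it: 0xC6
as_bytes = encodings_of(ch)

for name, data in as_bytes.items():
    print(f"{name:10} {data.hex(' ')}")
```

One code point, five different byte sequences: `utf-8` and `utf-8-sig` differ only by the three BOM bytes at the front, and the two UTF-16 variants are the same two bytes in opposite order.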