9 Comments

Yup. I first noticed this problem in 1973. C should always be the last choice, to be used only when it is the best choice.

This is an interesting post because it highlights a widely held misconception about C. The fact that C has such primitive string functions is not a flaw in the language that will ever be fixed, because it is not a flaw at all! You are expecting C to be something it's not: a safe, functional, modern language. C is a systems programming language, and if you want to use it in a safe application space you must provide the 'safety' yourself. Adding more kludges to the language isn't helping. If you want all that safety built in, use a different language.

Idk, what does the PowerShell kanji stuff have to do with anything?

strlen(有り難う) returning 12 feels rather natural to me. I don't think I've ever cared about the number of Unicode code points in a string in my life. Most of the time, all I want to know is how big it is in memory. Sometimes I care about how big it is on a screen, but that's affected by font choice and other things outside the scope of any strlen-type functionality. Besides, the existence of Unicode modifier symbols kinda raises the question of whether "the number of Unicode code points in a string" is a well-defined operation to begin with.

The output of strcmp should be different as you’ve got two ‘%s’ there. 👍

Hm, I got emoji output to work on console by going like

SetConsoleCP(CP_UTF8); SetConsoleOutputCP(CP_UTF8)

like at https://github.com/DDR0/Wincrawl/blob/master/Wincrawl2/io.cpp#L24, maybe that'd help with Japanese too?

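#include <stdio.h>
#include <string.h>

int main(void) {
    char source[] = "Hello, world!";
    char* destination = source; // destination points at source's own storage; no new buffer is created
    strcpy(destination, source); // Copies the string onto itself; strcpy on overlapping objects is undefined behavior
    printf("Source: %s\n", source);
    printf("Destination: %s\n", destination);
    return 0;
}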

This copies the source into itself! The first example with char destination[20]; actually creates a new 20-character buffer in memory, whereas this second version points destination back at the source.

You need to change the Windows code page from 1252 to UTF-8, but this will break MSSQL during its upgrades. Windows, so fun.

I think "有り難う" means "thank you" instead of "hello" :)

Doh! I had workshopped a few examples and forgot to switch it out. Thanks for pointing it out!
