Difference between revisions of "Talk:Hex"

From Second Life Wiki
(easy to use, else correct at a glance, else small, else fast may be only one of two schools participating here)

Revision as of 20:38, 11 October 2007

Second Talk

I've rewritten the article to try to incorporate all the talk here so far.

I'm curious to know if everyone agrees that we now present a neutral point of view (http://en.wikipedia.org/wiki/Wikipedia:Neutral_point_of_view) that accommodates our different perspectives on the relative measures and importance of such fuzzy source code qualities as:

  1. easy to use
  2. correct at a glance
  3. small
  4. fast

-- Ppaatt Lynagh 08:52, 11 October 2007 (PDT)

First Talk

The compiler doesn't optimize the output, so for a function that is going to be used as a core function, speed and size should be the most important things. Personally I wouldn't use hex, I'd only use hexu: LSL treats the upper bit as the sign bit, so adding the sign is unnecessary for LSL. Of course who really transfers numbers between LSL scripts uses hex strings anyway? I use Base64 encoding for floats and just leave integers raw, or sometimes convert them to Base64 too. Hex is just too slow to generate. -- Strife Onizuka 10:03, 10 October 2007 (PDT)
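For readers weighing the two encodings mentioned above, here is a minimal Python sketch (not LSL) that puts a 32-bit integer through both. It assumes the Base64 form is simply the four big-endian bytes of the integer, which is how llIntegerToBase64 is understood to behave. The helper names int_to_base64, base64_to_int and int_to_hex are hypothetical and exist only for this illustration.

```python
import base64
import struct


def int_to_base64(n: int) -> str:
    """Encode a signed 32-bit integer as Base64 of its 4 big-endian bytes (assumed layout)."""
    return base64.b64encode(struct.pack(">i", n)).decode("ascii")


def base64_to_int(s: str) -> int:
    """Decode the 8-character Base64 string back to a signed 32-bit integer."""
    return struct.unpack(">i", base64.b64decode(s))[0]


def int_to_hex(n: int) -> str:
    """Unsigned 32-bit hex view without leading zeros."""
    return format(n & 0xFFFFFFFF, "X")


for value in (0, 0x2000, -1):
    # Base64 is always 8 characters here; hex varies from 1 to 8 digits.
    print(value, int_to_base64(value), int_to_hex(value))
```

The sketch says nothing about speed inside LSL itself. It only shows that both forms carry the same 32 bits and that the Base64 form has a fixed length.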

Two replies:

1.

I agree hexu and hex shouldn't be used as core functions for communication between scripts.

I agree we could/should improve this hex article by adding a note to explain when people should prefer alternatives such as llIntegerToBase64.

I see you say "Of course who really transfers numbers between LSL scripts uses hex strings anyway?" Me, I call hex to show the hex to me, as an argument of llOwnerSay, when I'm learning stuff that involves hex, such as llGetObjectPermMask, which we already document in terms of hex, e.g., saying what x2000 means in that context. Likely I'll also call hex when having the communication be in the form of LSL source fragments matters, as in Chatbot.

-- Ppaatt Lynagh 10:53, 10 October 2007 (PDT)

Using hex or hexu for user output is totally valid; I was only ruling out its use for script to script communications. In an instance like displaying the value of llGetObjectPermMask to the user you wouldn't want to use hex because it is a bit-field; having it display the sign and flip the bits would be very confusing, so hexu would be a better choice for bit-fields. -- Strife Onizuka 20:28, 10 October 2007 (PDT)
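To make the two views of the same 32 bits concrete, here is a small Python sketch (not the article's LSL code). The names hexu and hex_signed follow the article's naming, but these bodies are only an assumed illustration, and the 0x prefix follows Python's convention rather than anything settled on this page.

```python
MASK32 = 0xFFFFFFFF


def hexu(n: int) -> str:
    """Unsigned view: show the 32-bit pattern exactly as stored."""
    return "0x" + format(n & MASK32, "X")


def hex_signed(n: int) -> str:
    """Signed view: a minus sign followed by the magnitude, as Python's hex does."""
    if n < 0:
        return "-" + hexu(-n)
    return hexu(n)


# A bit-field with bit 31 and bit 13 set, held in a signed 32-bit integer.
mask = 0x80002000 - (1 << 32)

print(hexu(mask))        # bits visible directly
print(hex_signed(mask))  # same value, but the set bits are no longer visible
print(hexu(-1), hex_signed(-1))
```

For values that leave bit 31 clear, both views print the same digits. They differ only once the top bit is set.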
Thanks for working to hear my every point. I think I understand and agree with every point you have made time to type out here; I think I am inviting you to consider a wider variety of LSL Wiki visitors.
What's confusing is relative - not absolute. What's confusing changes by what you already know.
To me, you sound like a person who long ago learned the difference between the C ideas of signed and unsigned integers, and the two's complement encoding of negative integers that makes you expect to see xFF... as the hex corresponding to a cast to unsigned from the signed integer -1. Yes, people like that often find signed integers pointlessly confusing. Indeed LSL and Java and Python and so on annoy such people by matching early C and differing from late C in defining only a signed integer type without also declaring an unsigned integer type.
Significantly many other people arrive with no understanding of unsigned integers and no understanding of two's complement. Libraries like the LSL function library accommodate such people by, bizarrely often, leaving the most significant bit clear and wasted. For example, PERM_ALL of LSL is not xFF... but is instead x7FF....
In sharp contrast to the two's-complement unsigned folk, the people who know only signed integers find it confusing when a routine defined to convert to hex also implicitly converts to unsigned before converting to hex. Python got this wrong in the beginning, naturally siding first with the implementation people who naturally focus first on size and speed and AT&T C/assembly tradition. Python eventually changed to get this right. For example, http://www.python.org/dev/peps/pep-0237/ discusses why not format "negative short ints" "as unsigned C long".
Thus, "using hex or hexu for user output is totally valid", yes. On "ruling out [its] use for script to script communications" again I can only follow you to a relative conclusion, not an absolute conclusion. Yes, when size and speed matter most, then yes people should turn instead to some other scheme. Situations which emphasise clarity over size and speed do occur significantly often. For example, the humans involved may feel the priority is to be able to reliably decode the script-to-script communication at a glance by eye. As for "an instance like displaying the value of llGetObjectPermMask to the user", the people who know only signed integers indeed do want hex to display the sign. They feel that hexu "flips the bits" by its implicit conversion to unsigned bitfield from signed bitfield, so hex and not hexu "would be a better choice for bit-fields".
Have I clearly distinguished the two populations of people involved? Can you see that the people who feel so differently than you do have equally well-functioning brains, just engaged with a different background? I think I have enough background myself to join either population, depending on context. I think we should tweak this hex article to suit both kinds of people, helping and trusting the reader to choose wisely according to either context that the reader holds.
-- Ppaatt Lynagh 07:08, 11 October 2007 (PDT)
P.S. The population of people confused by signed integers further subdivides. Significantly many of those people go farther, objecting to dropping the more significant nybbles. Thus for those people a 32-bit field like x82000 should be expressed as x00082000, since all digital numbers have value and length. People then further subdivide according to people who accept the Arabic tradition of writing least to most significant from right to left, and people who don't, such as Alan Turing. There are also people who copy the meaningless 0 in from C to write 0x rather than x or h, and people who put the x or h on the left or on the right.
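The "value and length" position in the P.S. above can be sketched in Python as a fixed-width formatter. The name hex_fixed and the leading x prefix are assumptions made for this illustration only.

```python
def hex_fixed(n: int, width_bits: int = 32) -> str:
    """Format n as an unsigned field of width_bits, keeping every leading zero nybble."""
    digits = width_bits // 4
    mask = (1 << width_bits) - 1
    return "x" + format(n & mask, "0{}X".format(digits))


print("x" + format(0x82000, "X"))  # x82000, leading nybbles dropped
print(hex_fixed(0x82000))          # x00082000, length preserved
print(hex_fixed(0x2000, 16))       # a 16-bit field keeps four nybbles
```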
I will admit I am biased towards the way C does it. LSL was designed to mimic C++ and written in C++ but that isn't an excuse. Just because C or Python jumps off a cliff doesn't mean I will follow.
All good script communications systems are hidden or obscured. To listen in on the communications you need specialized code in place. I favor speed in my communications, so I use specialized comm debugging scripts that also decode the messages to a human readable form. In the process of decoding the messages they can identify flaws in the communication logic as well as provide the debugging information to the user. I rarely find flaws in the comm logic; I spend a lot of time designing and testing the algorithms I use. It's a bit of an obsession...
People rarely count in hex or octal; the vast majority of the uses for both involve low-level programming, when working with the data in such a way that it is beneficial to see the bits. Negative hex obscures the bits... unless you are talking about floats. Floats don't use two's complement for negative numbers, so hex floats do not undergo an unsigned-signed conversion. Hex-floats are the bee's knees when it comes to building floats by hand, and since LSL is compiled by GCC it supports hex-floats on the string2float explicit typecast.
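The point about floats can be checked outside LSL. The Python sketch below parses a C99-style hex-float and then inspects the 32-bit single-precision pattern; float_bits is a hypothetical helper written for this note. Python's own floats are 64-bit, so the value is packed down to 32 bits here only to resemble an LSL float.

```python
import struct


def float_bits(x: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of x as an integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


pos = float.fromhex("0x1.8p1")    # 1.5 * 2**1 = 3.0
neg = float.fromhex("-0x1.8p1")   # same digits with a sign in front

print(pos, neg)
print(format(float_bits(pos), "08X"), format(float_bits(neg), "08X"))
# The two patterns differ only in bit 31, the sign bit.
print(format(float_bits(pos) ^ float_bits(neg), "08X"))
```

Because the sign is a separate bit rather than a two's complement transformation, writing the minus sign in front of a hex-float leaves the remaining digits untouched.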
<rant>A signed bit-field doesn't make sense, it's not supposed to be treated like a number but a collection of bits; treating it like a number and displaying it with a sign is totally confusing and makes it harder to read. Sorry but I do not relish the thought of having to do bit arithmetic in my head just to read a bit-field that uses bit 31.</rant>
As to the people you mentioned in your PS, we shouldn't be catering to people who think the common way of displaying numbers is wrong. Such a discussion is reviving long dead debates: it's trolling. Catering to trolls is like asking for a sharp stick in the eye.
You are correct, we should be endeavoring to make the documentation more accessible to more people. -- Strife Onizuka 21:38, 11 October 2007 (PDT)

2.

I'd like you to entertain the theory that I already understood every point you're making before you stated it, but that I'm inviting you to give attention to some other significant points also.

I think we're beginning to understand each other. I'm especially encouraged to see I guessed correctly that you would only ever call hexu, as you now confirm here. That is the choice that anyone who values mostly size and speed would make, yes.

I value clarity and conformity and size and speed, not just some of those, never letting any one of them trump all the others in every situation. For clarity in use and conformity to a widely accepted and much debated convention, I'd like to present a complete and correct implementation of the Python hex spec unchanged. I remember I created this page.

The Python hex spec has a documented history of moving from hexu to hex, as people who hold mainly to speed and size values clash with a wider community of people. At this moment our page here diverges from the Python hex spec only in that we return upper case at the hex level, same as at the hexu level. Perhaps we would all accept the compromise of calling hexu with the list of nybbles to return. Call with "0123456789ABCDEF" to get upper case back, call with "0123456789abcdef" to get lower case back, and meanwhile wish LSL let us specify default values for subroutine arguments.
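The compromise proposed above, passing the list of nybbles to hexu, might look like this in Python. This is only a sketch of the idea and not the article's LSL code. The default argument is something Python offers and LSL does not, and the constant names UPPER and LOWER are assumptions.

```python
UPPER = "0123456789ABCDEF"
LOWER = "0123456789abcdef"


def hexu(n: int, digits: str = UPPER) -> str:
    """Unsigned 32-bit hex of n, drawing each nybble from the supplied digit string."""
    n &= 0xFFFFFFFF
    out = digits[n & 0xF]          # always emit at least one digit
    n >>= 4
    while n:
        out = digits[n & 0xF] + out
        n >>= 4
    return out


print(hexu(0x2ABC))          # upper case by default
print(hexu(0x2ABC, LOWER))   # caller picks lower case
```

The case is then fixed by the table the caller passes in, so no separate case conversion runs afterwards.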

For clarity in implementation, I liked having hex and hexu return lower and upper case respectively. Anyone understanding the code at a glance could tweak the code to get what they like, especially when assisted by comments like your comment that points people reading the code towards editing HEXC.

For clarity in implementation, I liked never using an assignment in a test. New people don't have to learn that idiom to read and correctly understand the code at a glance. Old people can trivially make that optimisation as they receive the code - it is a purely mechanical optimisation - people could even write a pre-processor to do that kind of transformation or even teach the client compiler to do it.

If you insist on presenting code optimised to the point of unreadability here, maybe we could compromise on presenting a couple of different versions together. Up front one version to read, then another version claiming to produce exactly the same results but smaller or quicker, both versions here together, rather than on confusingly separated pages like the Separate Words and ParseString2List pages.
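As an illustration of presenting two versions together, here is a Python sketch with one loop written for reading and one using an assignment inside the loop test. Both function names are hypothetical, and the compact form relies on the := operator, which needs Python 3.8 or later.

```python
DIGITS = "0123456789ABCDEF"


def hexu_readable(n: int) -> str:
    """Version to read: one step per line, no assignment inside a test."""
    n &= 0xFFFFFFFF
    out = ""
    while True:
        out = DIGITS[n & 0xF] + out
        n = n >> 4
        if n == 0:
            break
    return out


def hexu_compact(n: int) -> str:
    """Version claiming identical results, with the shift folded into the loop test."""
    n &= 0xFFFFFFFF
    out = DIGITS[n & 0xF]
    while (n := n >> 4):
        out = DIGITS[n & 0xF] + out
    return out


for value in (0, 1, 0x2000, 0x7FFFFFFF, -1):
    print(value, hexu_readable(value), hexu_compact(value))
```

Keeping both on one page lets a reader confirm the equivalence by running them side by side.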

Personally my history of pain in SL has been that I first run out of human time to read code, then I run out of script space to run code, then I run out of lag time to run code. I gather your history of pain has been different, I'm sorry to hear of your pain.

Am I yet more clear than mud? Wholly coherent? Partly persuasive?

-- Ppaatt Lynagh 10:39, 10 October 2007 (PDT)

Providing multiple versions is a good idea and I almost did, but the differences are so tiny I didn't see the point.
I've looked through [Google] Python's documentation, and I can't find any reference to what case the output should be. This isn't Python, it's LSL; we should be writing code that is best suited for LSL. If it's not important enough to make it into Python's documentation then it was intended to be implementation specific. We can still be standards-compliant and use a single case. LSL is limited by memory and execution speed. Conserving both is important. Keeping compliance with a perceived specification from another programming language isn't.
LSL is a bastard of a language (so is Python, but that's another discussion). In most programming languages, good programming practices revolve around breaking code up into readable chunks and using intermediate variables. Doing that in LSL is a bad programming practice: it slows down the code and wastes valuable memory. The LSL compiler does not optimize code; intermediate variables are not optimized out, they stay on the stack until the function or event returns (or state changes). Variables declared in inner scopes are all given unique memory addresses on the stack, and the address is not reused for other variables. Function calls are expensive in LSL: it's 18 bytes for a built-in function with no parameters or return, and 20 bytes for a user function with no parameters or return. The return costs 1 byte and each parameter costs however much it takes to copy it onto the stack. That's just the bytecode costs; there are stack costs too. LSL uses pass by value for every operation (the VM doesn't support pass by reference). The LSL memory space is 16KiB. Computers haven't had that little memory to work in since the 1970s, and the concept of good programming practices wasn't even invented then.
Default arguments are a compiler thing; after it's compiled you wouldn't be able to tell the difference if it was a default or not by looking at the bytecode.
The case of hex digits is only important to the user; there isn't a technical reason for a distinction. In an environment as limited and hostile to the programmer as LSL is, concessions need to be made. Is something only the user might appreciate really necessary?
Constancy of case is easier for the user. Why should they return different cases? We shouldn't be inheriting other programming languages' dogma without good reason. -- Strife Onizuka 20:28, 10 October 2007 (PDT)
Sorry I don't feel heard yet.
I think we're not talking about Python-izing LSL, since of course LSL remains LSL. I think we're talking about borrowing an idiom that's proven useful in Python to reuse in LSL.
The link http://docs.python.org/lib/built-in-funcs.html, cited by the article since its first publication, is the correct Python docs link; the page there rudely buries the short definition of Python hex in amongst a pile of other stuff. I'm sorry Google doesn't know that yet.
The upper/lower case issue in hex has decades of standing, pre-Google. The article originally made casual reference to this history, I quote, "the easier-to-type lower case nybbles a la AT&T, rather than the easier-to-read upper case nybbles a la IBM". Python prints lower case because C prints lower case. The doc doesn't discuss this design choice because C people, aka AT&T people, take that design choice for granted.
By your guesstimate that "the differences are so tiny" I'll likely return later to provide corrections to make spec = Python = demo = sample results, and then trust you to either leave the corrections in place or to add a second example, rather than just reverting them for a third time. The corrections matter to me because I care about more than size and speed, as explained above. I'll keep the hexu code factored out as an example of what people who care most about size and speed will choose to code.
Hope this helps, -- Ppaatt Lynagh 21:47, 10 October 2007 (PDT)
Also relevant is Python "SF bug #1224347: hex longs now print with lowercase letters just like their int counterparts."
The page you reference does not in fact define the case the function should return. I did the Google search because I wanted to see whether anywhere in the documentation they defined exactly how it should work. The point: you can't argue that it violates the spec if the spec is undefined. I did due diligence to find out what the spec was for the function and it turned up nothing of use.
C can print hex in both uppercase and lowercase with printf, %X and %x flags respectively. I don't actually care which case it is as long as it isn't changed at runtime. It's irresponsible programming to change it with llToLower/llToUpper; there is no measurable benefit, it's a waste of CPU time. CPU time is a limited resource that everyone has to share. No raindrop thinks it's responsible for the flood. As the designers of these functions, we are responsible for the flood of lag they create.
I am sorry, I was referring to the changes to hexu as tiny, not the changes to hex; I should have been more specific. -- Strife Onizuka 21:38, 11 October 2007 (PDT)
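The %x and %X distinction mentioned above carries over to Python's printf-style formatting, which gives a quick way to compare choosing the case while formatting with converting it afterwards. This sketch only shows that the results match. It makes no claim about the relative cost in LSL, where the second pass would correspond to an llToUpper call.

```python
n = 0x2ABC

lower = "%x" % n                 # case chosen while formatting
upper = "%X" % n
converted = ("%x" % n).upper()   # same text, but built with an extra pass over the string

print(lower, upper, converted)
```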