Difference between revisions of "Talk:Hex"

From Second Life Wiki
Jump to navigation Jump to search
(claim that those weird people who like signed integers do exist and do matter)
(add mention of people who write hex with leading zeroes, people who choose x or h, people who write lil- or big- endian)
Line 28: Line 28:


::Have I clearly distinguished the two populations of people involved? Can you see that the people who feel so differently than you do have equally well-functioning brains, just engaged with a different background? I think I have enough background myself to join either population, depending on context. I think we should tweak this hex article to suit both kinds of people, helping and trusting the reader to choose wisely according to either context that the reader holds.
::Have I clearly distinguished the two populations of people involved? Can you see that the people who feel so differently than you do have equally well-functioning brains, just engaged with a different background? I think I have enough background myself to join either population, depending on context. I think we should tweak this hex article to suit both kinds of people, helping and trusting the reader to choose wisely according to either context that the reader holds.
::-- [[User:Ppaatt Lynagh|Ppaatt Lynagh]] 07:08, 11 October 2007 (PDT)
::P.S. The population of people confused by signed integers further subdivides. Significantly many of those people go farther, and objecting to dropping the more significant nybbles. Thus for those people a 32-bit field like x82000 should be expressed as x00082000, since all digital numbers have value and length. People then further subdivide according to people who accept the Arabic tradition of writing least to most significant from right to left, and people who don't, such as Alan Turing. Also people who copy the meaningless 0 in from C to write 0x rather than x or h, and people who put the x or h on the left or on the right.


2.
2.

Revision as of 07:08, 11 October 2007

The compiler doesn't optimize the output, for a function that is going to be used as a core function, speed & size should be the most important thing. Personally I wouldn't use hex, I'd only use hexu, LSL treats the upper bit as the sign bit; adding the sign is unnecessary for LSL. Of course who really transfers numbers between LSL scripts uses hex strings anyway? I use Base64 encoding for floats and just leave integers raw, or sometimes convert them to base64 too. Hex is just too slow to generate. -- Strife Onizuka 10:03, 10 October 2007 (PDT)

Two replies:

1.

I agree hexu and hex shouldn't be used as core functions for communication between scripts.

I agree we could/ should improve this hex article by adding a note to explain when people should prefer alternatives such as llIntegerToBase64.

I see you say " Of course who really transfers numbers between LSL scripts uses hex strings anyway?" Me, I call hex to show the hex to me, as an argument of llOwnerSay, when I'm learning stuff that involves hex, such as llGetObjectPermMask that we already now doc in terms of hex, e.g., saying what x2000 means in that context. Likely I'll also call hex when having the communication be in the form of LSL source fragments matters, as in Chatbot.

-- Ppaatt Lynagh 10:53, 10 October 2007 (PDT)

Using hex or hexu for user output is totally valid, I was only ruling out it's use for script to script communications. In an instance like displaying the value of llGetObjectPermMask to the user you wouldn't want to use hex because it is a bit-field, having it display the sign and flip the bits would be very confusing, so hexu would be a better choice for bit-fields. -- Strife Onizuka 20:28, 10 October 2007 (PDT)
Thanks for working to hear my every point. I think I understand and agree with every point you have made time to type out here, I think I am inviting you to consider a wider variety of LSL Wiki visitors.
What's confusing is relative - not absolute. What's confusing changes by what you already know.
To me, you sound like a person who long ago learned the difference between the C ideas of signed and unsigned integers and the two's complement encoding of negative integers that makes you expect to see xFF... as the hex corresponding to cast to unsigned from the signed integer -1. Yes people like that often find signed integers pointlessly confusing. Indeed LSL and Java and Python and so on annoy such people by matching early C and differing from late C in defining only a signed integer type without also declaring an unsigned integer type.
Significantly many other people arrive with no understanding of unsigned integers and no understanding of two's complement. Languages like the LSL function library accomodate such people by bizarrely often leaving the most significant bit clear and wasted. For example, PERM_ALL of LSL is not xFF... but is instead x7FF....
In sharp contrast to the two's-complement unsigned folk, the people who know only signed integers find it confusing when a routine defined to convert to hex also implicit converts to unsigned before converting to hex. Python got this wrong in the beginning, naturally siding first with the implementation people who naturally focus first on size and speed and AT&T C/ assembly tradition. Python eventually changed to get this right. For example, http://www.python.org/dev/peps/pep-0237/ discusses why not format "negative short ints" "as unsigned C long".
Thus, "using hex or hexu for user output is totally valid", yes. On "ruling out [its] use for script to script communications" again I can only follow you to a relative conclusion, not an absolute conclusion. Yes, when size and speed matter most, then yes people should turn instead to some other scheme. Situations which emphasise clarity over size and speed do occur significantly often. For example, the humans involved may feel the priority is to be able to reliably decode the script-to-script communication at a glance by eye. As for "an instance like displaying the value of llGetObjectPermMask to the user", the people who know only signed integers indeed do want hex to display the sign. They feel that hexu "flips the bits" by its implicit conversion to unsigned bitfield from signed bitfield, so hex and not hexu "would be a better choice for bit-fields".
Have I clearly distinguished the two populations of people involved? Can you see that the people who feel so differently than you do have equally well-functioning brains, just engaged with a different background? I think I have enough background myself to join either population, depending on context. I think we should tweak this hex article to suit both kinds of people, helping and trusting the reader to choose wisely according to either context that the reader holds.
-- Ppaatt Lynagh 07:08, 11 October 2007 (PDT)
P.S. The population of people confused by signed integers further subdivides. Significantly many of those people go farther, and objecting to dropping the more significant nybbles. Thus for those people a 32-bit field like x82000 should be expressed as x00082000, since all digital numbers have value and length. People then further subdivide according to people who accept the Arabic tradition of writing least to most significant from right to left, and people who don't, such as Alan Turing. Also people who copy the meaningless 0 in from C to write 0x rather than x or h, and people who put the x or h on the left or on the right.

2.

I'd like you to entertain the theory that I already understood every point you're making before you stated it, but that I'm inviting you to give attention to some other significant points also.

I think we're beginning to understand each other. I'm especially encouraged to see I guessed correctly that you would only ever call hexu, as you now confirm here. That is the choice that anyone who values mostly size and speed would make, yes.

I value clarity and conformity and size and speed, not just some of those, never always letting one trump all the others in every situation. For clarity in use and conformity to a widely accepted and much debated convention, I'd like to present a complete and correct implementation of the Python hex spec unchanged. I remember I created this page.

The Python hex spec has a documented history of moving from hexu to hex, as people who hold mainly to speed and size values clash with a wider community of people. At this moment our page here diverges from the Python hex spec only in that we return upper case at the hex level, same as at the hexu level. Perhaps we would all accept the compromise of calling hexu with the list of nybbles to return. Call with "0123456789ABCDEF" to get upper case back, call with "0123456789abcdef" to get lower case back, and meanwhile wish LSL let us specify default values for subroutine arguments.

For clarity in implementation, I liked having hex and hexu return lower and upper case respectively. Anyone understanding the code at a glance could tweak the code to get what they like, especially when assisted by comments like your comment that points people reading the code towards editing HEXC.

For clarity in implementation, I liked never using an assignment in a test. New people don't have to learn that idiom to read and correctly understand the code at a glance. Old people can trivially make that optimisation as they receive the code - it is a purely mechanical optimisation - people could even write a pre-processor to do that kind of transformation or even teach the client compiler to do it.

If you insist on presenting code optimised to the point of unreadability here, maybe we could compromise on presenting a couple of different versions together. Up front one version to read, then another version claiming to produce exactly the same results but smaller or quicker, both versions here together, rather than on confusingly separated pages like the Separate Words and ParseString2List pages.

Personally my history of pain in SL has been that I first run out of human time to read code, then I run out of script space to run code, then I run out of lag time to run code. I gather your history of pain has been different, I'm sorry to hear of your pain.

Am I yet more clear than mud? Wholly coherent? Partly persuasive?

-- Ppaatt Lynagh 10:39, 10 October 2007 (PDT)

Providing multiple versions is a good idea and I almost did but but the differences are so tiny I didn't see the point.
I've looked through [Google] Python's documentation, I can't find any reference to what case the output should be. This isn't Python, it's LSL, we should be writing code that is best suited for LSL. If it's not important enough to make it into Python's documentation then it was intended to be implementation specific. We can still be standard compliant and use a single case. LSL is limited by memory and execution speed. Conserving both is important. Keeping compliance with a perceived specification from another programing language isn't.
LSL is a bastard of a language (but so is Python but thats another discussion). In most programing languages, good programing practices revolve around breaking code up into readable chunks and using intermediate variables. Doing that in LSL is a bad programing practice, it slows down the code and wastes valuable memory. The LSL compile does not optimize code, intermediate variables are not optimized out, they stay on the stack until the function or event returns (or state changes). Variables declared in inner scopes are all given unique memory addresses on the stack, the address is not reused for other variables. Function calls are expensive in LSL, it's 18 bytes for a built-in function with no parameters or return, and 20 bytes for a user function with no parameters or return. The return costs 1 byte and each parameter costs how ever much to copy it onto the stack. Thats just the bytecode costs, there are stack costs too. LSL uses pass by value for every operation (the VM doesn't support pass by reference). The LSL memory space is 16KiB, computers haven't had that little memory to work in since the 1970's the concept of good programing practices wasn't even invented then.
Default arguments are a compiler thing, after it's compiled you wouldn't be able to tell the difference if it was a default or not by looking at the bytecode.
The case of hex digits is only important to the user; there isn't a technical reason for a distinction. In an environment so limited and hostile to the programmer as LSL is, concessions need to be made. Is something only the user might appreciate really necessary?
Constancy of case is easier for the user. Why should they return different cases? We shouldn't be inheriting other programing languages dogma without good reason. -- Strife Onizuka 20:28, 10 October 2007 (PDT)
Sorry I don't feel heard yet.
I think we're not talking about Python-izing LSL, since of course LSL remains LSL. I think we're talking about borrowing an idiom that's proven useful in Python to reuse in LSL.
The http://docs.python.org/lib/built-in-funcs.html already cited by the article since its first publication is the correct Python docs link for the page there that rudely buries the short definition of Python hex in amongst a pile of other stuff, I'm sorry Google doesn't know that yet.
The upper/ lower case issue in hex has decades of standing, pre Google. The article originally gave casual reference to this history, I quote, "the easier-to-type lower case nybbles a la AT&T, rather than the easier-to-read upper case nybbles a la IBM". Python prints lower case because C prints lower case. The doc doesn't discuss this design choice because C people, aka AT&T people, take that design choice for granted.
By your guesstimate that "the differences are so tiny" I'll likely return later to provide corrections to make spec = Python = demo = sample results, and then trust you to either leave the corrections in place or to add a second example, rather than just reverting them for a third time. The corrections matter to me because I care about more than size and speed, as explained above. I'll keep the hexu code factored out as an example of what people who care most about size and speed will choose to code.
Hope this helps, -- Ppaatt Lynagh 21:47, 10 October 2007 (PDT)
Also relevant is Python "SF bug #1224347: hex longs now print with lowercase letters just like their int counterparts."