Talk:Hex

From Second Life Wiki

Second Talk

I've rewritten the article to try to include all the talk here thru now.

I'm curious to know if everyone agrees that we now present a neutral point of view (http://en.wikipedia.org/wiki/Wikipedia:Neutral_point_of_view) that accommodates our different perspectives on the relative measures and importance of such fuzzy source code qualities as:

  1. easy to use
  2. correct at a glance
  3. small
  4. fast

-- Ppaatt Lynagh 08:52, 11 October 2007 (PDT)

It looks good. I'll write a tight version. -- Strife Onizuka 21:42, 11 October 2007 (PDT)
Thanks for making time to say we are progressing towards consensus on an NPOV. I'm happy to hear that. I have to turn my attention elsewhere awhile, but now all the more I look forward to returning to help again soon, maybe as soon as next week. -- Ppaatt Lynagh 07:04, 12 October 2007 (PDT)
I see here you promised, and in the article you delivered, a small implementation to present alongside the correct-at-a-glance and fast implementations. Thank you. -- Ppaatt Lynagh 06:34, 13 October 2007 (PDT)
I'm also surprised & disconcerted. I see you made more than one change, with no comment in the change history on why and no talk here to explain why. I specifically see you rewrote the section headings and moved some of the design rationale up front. I don't know why you did this. To my eye, these changes injure the clarity of the article: they distract the reader into dispute, rather than presenting all consensus up front. With no talk here to prepare me, I'm left guessing wildly how you could have felt these changes were improvements. I'll do my best to accept the spirit of what you may have had in mind, while also restoring the qualities of the article that matter more to me. I'll comment my every change, and as you can see I began here first with more talk. Hope this helps, -- Ppaatt Lynagh 06:34, 13 October 2007 (PDT)
So by now you can see my guesses of your intent include:
1. you want the page to grow to also discuss other specifications for conversion to hex,
2. you contributed an example in that direction as well as an example of a small implementation,
3. you want more emphasis on how all correctly understood implementations that conform to the same specification share the quality of being easy to call
As the page evolves towards an ever more neutral and complete point of view, I've tried to retain those qualities in particular. -- Ppaatt Lynagh 07:28, 13 October 2007 (PDT)
I'm going to respond in no particular order to both your comments and revision because it's tough to put things into a chronological order. Your opinions have helped clarify and shape my own opinions; sometimes they are in step with each other, other times not. My main reason for the edit was to give the article a neutral point of view, which it lacked. It was biased towards a specific single implementation and was single-implementation-centric. I'm not blaming you; the shape of the article has a lot to do with how it was formed, and we must now roll with the punches.
I reworked the ordering and description because if the user is to choose a version of the code, they need to know the design rationale; they need to know why one block of code is better than the other for the particular situation. The purpose of the section at the top was to make clear that no one block of code was perfect for all situations. You can't put that at the very bottom when you present four different versions; it needs to be at the forefront to explain why there are four different versions. The reason for the section retitling was so that it better fitted with the new description; in an optimization context the titles didn't quite fit. Readability has a documented technical history in computer programming (check Wikipedia); "Correct At A Glance" on the other hand sounds like a marketing catchphrase that needs a (TM) after it (if you do a Google search for '"Correct At A Glance"' your userpage comes up as hit #2).
Your comment about "wading thru" puts forth the notion that there is a debate and it needs a resolution through a single implementation. There does not need to be a debate, nor a single implementation. A single implementation would not do justice to the LSL or programming community in general. By releasing only one implementation we force a solution on the populace that is not always appropriate; by providing different implementations optimized for common use-cases this can be ameliorated. I thus see your comment as counterproductive towards a neutral point of view, seeing that it is biased to a particular outcome.
The first three functions are all equally easy to use; they have identical interfaces, so claiming Ease Of Use for a single one of them disadvantages the rest. "Ease Of Use" is a weasel phrase in this case because it is so difficult to qualify.
I disagree with the statement "... at the expense of other qualities"; it is a generalization that implies that it is impossible to optimize the code for several if not all qualities. The sentence that follows it should be stripped outright: "we" is a weasel word for "I" and shows your bias. If you can't be bothered to test the other versions you shouldn't be lodging accusations against them on something that can't be qualified.
Currently there is only one LSL compiler and it does not optimize code (unless the folks at OpenSim/LibSL have finished theirs, and I don't think they are writing an optimizing compiler regardless). The compiler is quite predictable in both its LSO and CLI output: it does not optimize code, it has not optimized code for 4 years, and there have been no sounds from LL that it will ever output optimized code. LL at this time is waiting for Mono before they make any changes to the language. There has been community noise about opening SL to other CIL enabled languages, and LL has been very receptive to this; if it happens, few improvements are likely to happen to LSL. I bring this up because any discussion about Mono and "naive compilation" on an article not about the compiler is off-topic. I'm going to suggest that "naive compilation" be stripped from the article outright; it is a term that is not widely used and does not appear in any glossary (that I could find with Google; I doubt any other search engine is going to be any better). Your use of "naive compilation" suggests that LSL can be compiled in a different fashion, which is not the case; it is damning with faint praise. If we were to be fair and keep the "naive compilation" sentiment it would need to be applied to "Correct At A Glance" in a most prejudicial fashion for it to be accurate.
In the fourth function you butchered the meaning of the text I wrote. It's not how many times you call the function that provides a bytecode savings, it's how many places in the code the function is explicitly called that matters; let's not forget to mention that it's 0 to 8 times not 0 to 7 ("no more then 8 places").
User experience does not meet the metric for scientific review that looking at the bytecode can provide. Your unequal application of sentiments about Mono is prejudicial. The Mono compiler is already available (look in the client source) and detailed posts have been made on the forums and blog about how the Mono VM will be used. If my memory serves me correctly LL is executing the CIL bytecodes one at a time so I doubt they will be able to take advantage of any compiler optimizations done by the VM. That is a discussion for another article. Mono is a big unknown and shouldn't be mentioned in articles except when there is some issue about whether the function will work properly. All the functions on the hex page should work properly in Mono; none of them exacerbate the right-to-left left-to-right ordering change.
Taking into consideration what we have said, I shall do a revision. Hopefully we have overcome the big hurdles.
-- Strife Onizuka 07:47, 14 October 2007 (PDT)

First Talk

The compiler doesn't optimize the output; for a function that is going to be used as a core function, speed & size should be the most important things. Personally I wouldn't use hex, I'd only use hexu; LSL treats the upper bit as the sign bit, so adding the sign is unnecessary for LSL. Of course who really transfers numbers between LSL scripts uses hex strings anyway? I use Base64 encoding for floats and just leave integers raw, or sometimes convert them to Base64 too. Hex is just too slow to generate. -- Strife Onizuka 10:03, 10 October 2007 (PDT)
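For readers weighing the two encodings mentioned above, here is a minimal Python sketch, not LSL. The helper names `int_to_base64`, `base64_to_int` and `int_to_hex` are hypothetical, and the big-endian, 4-byte Base64 layout is an assumption modelled on what llIntegerToBase64 is understood to produce. It shows only the shape of the output and says nothing about how fast either is to generate in LSL.

```python
import base64
import struct

def int_to_base64(value):
    """Pack a signed 32-bit integer big-endian and Base64-encode the 4 bytes."""
    return base64.b64encode(struct.pack(">i", value)).decode("ascii")

def base64_to_int(text):
    """Inverse of int_to_base64: decode 4 bytes back to a signed 32-bit integer."""
    return struct.unpack(">i", base64.b64decode(text))[0]

def int_to_hex(value):
    """Hex digits of the raw 32-bit pattern, always 8 characters, no prefix."""
    return "%08X" % (value & 0xFFFFFFFF)

print(int_to_base64(0))    # AAAAAA==
print(int_to_base64(-1))   # /////w==
print(int_to_hex(-1))      # FFFFFFFF
```

Under these assumptions both forms take 8 characters for a full-width integer, so the choice between them rests on generation cost and on whether a human needs to read the result.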

Two replies:

1.

I agree hexu and hex shouldn't be used as core functions for communication between scripts.

I agree we could/ should improve this hex article by adding a note to explain when people should prefer alternatives such as llIntegerToBase64.

I see you say " Of course who really transfers numbers between LSL scripts uses hex strings anyway?" Me, I call hex to show the hex to me, as an argument of llOwnerSay, when I'm learning stuff that involves hex, such as llGetObjectPermMask that we already now doc in terms of hex, e.g., saying what x2000 means in that context. Likely I'll also call hex when having the communication be in the form of LSL source fragments matters, as in Chatbot.

-- Ppaatt Lynagh 10:53, 10 October 2007 (PDT)

Using hex or hexu for user output is totally valid; I was only ruling out its use for script-to-script communications. In an instance like displaying the value of llGetObjectPermMask to the user you wouldn't want to use hex because it is a bit-field; having it display the sign and flip the bits would be very confusing, so hexu would be a better choice for bit-fields. -- Strife Onizuka 20:28, 10 October 2007 (PDT)
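To make the signed versus unsigned distinction concrete, here is a small Python sketch of the two conventions, assuming 32-bit integers as in LSL. The functions are illustrations only, not the article's LSL code: `hex_signed` and `to_signed32` are hypothetical names, and the "0x" prefix and upper case digits are assumptions.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(value):
    """Interpret the low 32 bits of value as a two's-complement signed integer."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value

def hexu(value):
    """Unsigned style: show the raw 32-bit pattern."""
    return "0x%X" % (value & MASK32)

def hex_signed(value):
    """Signed style: show a minus sign and the magnitude for negative values."""
    value = to_signed32(value)
    if value < 0:
        return "-0x%X" % -value
    return "0x%X" % value

print(hexu(0x2000), hex_signed(0x2000))          # 0x2000 0x2000
print(hexu(-1), hex_signed(-1))                  # 0xFFFFFFFF -0x1
print(hexu(0x80002000), hex_signed(0x80002000))  # 0x80002000 -0x7FFFE000
```

The two styles agree until bit 31 is set. At that point the unsigned form still shows each bit as stored, while the signed form shows the magnitude of the negative number.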
Thanks for working to hear my every point. I think I understand and agree with every point you have made time to type out here; I think I am inviting you to consider a wider variety of LSL Wiki visitors.
What's confusing is relative - not absolute. What's confusing changes by what you already know.
To me, you sound like a person who long ago learned the difference between the C ideas of signed and unsigned integers and the two's complement encoding of negative integers that makes you expect to see xFF... as the hex corresponding to cast to unsigned from the signed integer -1. Yes people like that often find signed integers pointlessly confusing. Indeed LSL and Java and Python and so on annoy such people by matching early C and differing from late C in defining only a signed integer type without also declaring an unsigned integer type.
Significantly many other people arrive with no understanding of unsigned integers and no understanding of two's complement. Languages like the LSL function library accommodate such people by bizarrely often leaving the most significant bit clear and wasted. For example, PERM_ALL of LSL is not xFF... but is instead x7FF....
In sharp contrast to the two's-complement unsigned folk, the people who know only signed integers find it confusing when a routine defined to convert to hex also implicitly converts to unsigned before converting to hex. Python got this wrong in the beginning, naturally siding first with the implementation people who naturally focus first on size and speed and AT&T C/ assembly tradition. Python eventually changed to get this right. For example, http://www.python.org/dev/peps/pep-0237/ discusses why not format "negative short ints" "as unsigned C long".
Thus, "using hex or hexu for user output is totally valid", yes. On "ruling out [its] use for script to script communications" again I can only follow you to a relative conclusion, not an absolute conclusion. Yes, when size and speed matter most, then yes people should turn instead to some other scheme. Situations which emphasise clarity over size and speed do occur significantly often. For example, the humans involved may feel the priority is to be able to reliably decode the script-to-script communication at a glance by eye. As for "an instance like displaying the value of llGetObjectPermMask to the user", the people who know only signed integers indeed do want hex to display the sign. They feel that hexu "flips the bits" by its implicit conversion to unsigned bitfield from signed bitfield, so hex and not hexu "would be a better choice for bit-fields".
Have I clearly distinguished the two populations of people involved? Can you see that the people who feel so differently than you do have equally well-functioning brains, just engaged with a different background? I think I have enough background myself to join either population, depending on context. I think we should tweak this hex article to suit both kinds of people, helping and trusting the reader to choose wisely according to either context that the reader holds.
-- Ppaatt Lynagh 07:08, 11 October 2007 (PDT)
P.S. The population of people confused by signed integers further subdivides. Significantly many of those people go farther, and object to dropping the more significant nybbles. Thus for those people a 32-bit field like x82000 should be expressed as x00082000, since all digital numbers have value and length. People then further subdivide according to people who accept the Arabic tradition of writing least to most significant from right to left, and people who don't, such as Alan Turing. Also people who copy the meaningless 0 in from C to write 0x rather than x or h, and people who put the x or h on the left or on the right.
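The fixed-width preference described in the P.S. can be sketched in Python as follows. The helper name `hex_fixed_width` is hypothetical, the 32-bit width is an assumption, and the prefix is left to the caller since that is exactly the point on which people differ.

```python
def hex_fixed_width(value, width=8):
    """Hex digits of the low 32 bits, left-padded with zeros to `width` digits."""
    return format(value & 0xFFFFFFFF, "0%dX" % width)

print("%X" % 0x82000)            # 82000 (leading nybbles dropped)
print(hex_fixed_width(0x82000))  # 00082000 (all eight nybbles shown)
```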
I will admit I am biased towards the way C does it. LSL was designed to mimic C++ and written in C++ but that isn't an excuse. Just because C or Python jumps off a cliff doesn't mean I will follow.
All good script communications systems are hidden or obscured. To listen in on the communications you need specialized code in place. I favor speed in my communications, so I use specialized comm debugging scripts that also decode the messages to a human readable form. In the process of decoding the messages they can identify flaws in the communication logic as well as provide the debugging information to the user. I rarely find flaws in the comm logic, I spend a lot of time designing and testing the algorithms I use. It's a bit of an obsession...
People rarely count in hex or octal; the vast majority of the uses for both involve low level programming, when working with the data in such a way that it is beneficial to see the bits. Negative hex obscures the bits... unless you are talking about floats. Floats don't do the two's complement for negative numbers; hex floats do not do an unsigned-signed conversion. Hex-floats are the bee's knees when it comes to building floats by hand, and since LSL is compiled by GCC it supports hex-floats on the string2float explicit typecast.
<rant>A signed bit-field doesn't make sense, it's not supposed to be treated like a number but a collection of bits; treating it like a number and displaying it with a sign is totally confusing and makes it harder to read. Sorry but I do not relish the thought of having to do bit arithmetic in my head just to read a bit-field that uses bit 31.</rant>
As to the people you mentioned in your PS, we shouldn't be catering to people who think the common way of displaying numbers is wrong. Such a discussion is reviving long dead debates: it's trolling. Catering to trolls is like asking for a sharp stick in the eye.
You are correct, we should be endeavoring to make the documentation more accessible to more people. -- Strife Onizuka 21:38, 11 October 2007 (PDT)
Yes we differ sharply in bias, though of course the bias of my school of poets is right. :-)
Yes we disagree strongly over the wisdom of unsigning ints, the value of obscurity, the definition of trolls, etc. Google tells me Python has a history of abandoning hex floats, I don't know why. I remember APL had a history of difficulties in communicating floats. We agree that inviting trolls to join is not helpful. -- Ppaatt Lynagh 07:40, 13 October 2007 (PDT)

2.

I'd like you to entertain the theory that I already understood every point you're making before you stated it, but that I'm inviting you to give attention to some other significant points also.

I think we're beginning to understand each other. I'm especially encouraged to see I guessed correctly that you would only ever call hexu, as you now confirm here. That is the choice that anyone who values mostly size and speed would make, yes.

I value clarity and conformity and size and speed, not just some of those, never always letting one trump all the others in every situation. For clarity in use and conformity to a widely accepted and much debated convention, I'd like to present a complete and correct implementation of the Python hex spec unchanged. I remember I created this page.

The Python hex spec has a documented history of moving from hexu to hex, as people who hold mainly to speed and size values clash with a wider community of people. At this moment our page here diverges from the Python hex spec only in that we return upper case at the hex level, same as at the hexu level. Perhaps we would all accept the compromise of calling hexu with the list of nybbles to return. Call with "0123456789ABCDEF" to get upper case back, call with "0123456789abcdef" to get lower case back, and meanwhile wish LSL let us specify default values for subroutine arguments.
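The compromise floated above, passing the list of nybbles into hexu, might look like this in Python. This is a sketch under assumptions: `hexu_with_digits` is a hypothetical name, the "0x" prefix follows Python's hex, and the default argument is a Python convenience that LSL does not offer, so an LSL version would need the digits passed explicitly at every call.

```python
def hexu_with_digits(value, digits="0123456789ABCDEF"):
    """Format the low 32 bits of value as hex using the 16 digits supplied."""
    value &= 0xFFFFFFFF
    out = ""
    while True:
        out = digits[value & 0xF] + out  # prepend the lowest nybble
        value >>= 4
        if value == 0:
            break
    return "0x" + out

print(hexu_with_digits(0x2ABC))                      # 0x2ABC
print(hexu_with_digits(0x2ABC, "0123456789abcdef"))  # 0x2abc
```

The case is then fixed by the caller's choice of digit string, with no conversion of the result afterwards.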

For clarity in implementation, I liked having hex and hexu return lower and upper case respectively. Anyone understanding the code at a glance could tweak the code to get what they like, especially when assisted by comments like your comment that points people reading the code towards editing HEXC.

For clarity in implementation, I liked never using an assignment in a test. New people don't have to learn that idiom to read and correctly understand the code at a glance. Old people can trivially make that optimisation as they receive the code - it is a purely mechanical optimisation - people could even write a pre-processor to do that kind of transformation or even teach the client compiler to do it.

If you insist on presenting code optimised to the point of unreadability here, maybe we could compromise on presenting a couple of different versions together. Up front one version to read, then another version claiming to produce exactly the same results but smaller or quicker, both versions here together, rather than on confusingly separated pages like the Separate Words and ParseString2List pages.

Personally my history of pain in SL has been that I first run out of human time to read code, then I run out of script space to run code, then I run out of lag time to run code. I gather your history of pain has been different, I'm sorry to hear of your pain.

Am I yet more clear than mud? Wholly coherent? Partly persuasive?

-- Ppaatt Lynagh 10:39, 10 October 2007 (PDT)

Providing multiple versions is a good idea and I almost did but the differences are so tiny I didn't see the point.
I've looked through [Google] Python's documentation, and I can't find any reference to what case the output should be. This isn't Python, it's LSL; we should be writing code that is best suited for LSL. If it's not important enough to make it into Python's documentation then it was intended to be implementation specific. We can still be standard compliant and use a single case. LSL is limited by memory and execution speed. Conserving both is important. Keeping compliance with a perceived specification from another programming language isn't.
LSL is a bastard of a language (but so is Python, but that's another discussion). In most programming languages, good programming practices revolve around breaking code up into readable chunks and using intermediate variables. Doing that in LSL is a bad programming practice; it slows down the code and wastes valuable memory. The LSL compiler does not optimize code: intermediate variables are not optimized out, they stay on the stack until the function or event returns (or state changes). Variables declared in inner scopes are all given unique memory addresses on the stack; the address is not reused for other variables. Function calls are expensive in LSL: it's 18 bytes for a built-in function with no parameters or return, and 20 bytes for a user function with no parameters or return. The return costs 1 byte and each parameter costs however much it takes to copy it onto the stack. That's just the bytecode costs; there are stack costs too. LSL uses pass by value for every operation (the VM doesn't support pass by reference). The LSL memory space is 16KiB; computers haven't had that little memory to work in since the 1970s, and the concept of good programming practices wasn't even invented then.
Default arguments are a compiler thing, after it's compiled you wouldn't be able to tell the difference if it was a default or not by looking at the bytecode.
The case of hex digits is only important to the user; there isn't a technical reason for a distinction. In an environment so limited and hostile to the programmer as LSL is, concessions need to be made. Is something only the user might appreciate really necessary?
Constancy of case is easier for the user. Why should they return different cases? We shouldn't be inheriting other programming languages' dogma without good reason. -- Strife Onizuka 20:28, 10 October 2007 (PDT)
Sorry I don't feel heard yet.
I think we're not talking about Python-izing LSL, since of course LSL remains LSL. I think we're talking about borrowing an idiom that's proven useful in Python to reuse in LSL.
The http://docs.python.org/lib/built-in-funcs.html already cited by the article since its first publication is the correct Python docs link for the page there that rudely buries the short definition of Python hex in amongst a pile of other stuff, I'm sorry Google doesn't know that yet.
The upper/ lower case issue in hex has decades of standing, pre Google. The article originally gave casual reference to this history, I quote, "the easier-to-type lower case nybbles a la AT&T, rather than the easier-to-read upper case nybbles a la IBM". Python prints lower case because C prints lower case. The doc doesn't discuss this design choice because C people, aka AT&T people, take that design choice for granted.
By your guesstimate that "the differences are so tiny" I'll likely return later to provide corrections to make spec = Python = demo = sample results, and then trust you to either leave the corrections in place or to add a second example, rather than just reverting them for a third time. The corrections matter to me because I care about more than size and speed, as explained above. I'll keep the hexu code factored out as an example of what people who care most about size and speed will choose to code.
Hope this helps, -- Ppaatt Lynagh 21:47, 10 October 2007 (PDT)
Also relevant is Python "SF bug #1224347: hex longs now print with lowercase letters just like their int counterparts."
The page you reference does not in fact define the case the function should return. I did the Google search because I wanted to see whether the documentation anywhere defined exactly how it should work. The point: you can't argue that it violates the spec if the spec is undefined. I did due diligence to find out what the spec was for the function and it turned up nothing of use.
C can print hex in both uppercase and lowercase with printf, %X and %x flags respectively. I don't actually care which case it is as long as it isn't changed at runtime. It's irresponsible programming to change it with llToLower/llToUpper; there is no measurable benefit, and it's a waste of CPU time. CPU time is a limited resource that everyone has to share. No raindrop thinks it's responsible for the flood. As the designers of these functions, we are responsible for the flood of lag they create.
I am sorry, I was referring to the changes to hexu as tiny, not the changes to hex; I should have been more specific. -- Strife Onizuka 21:38, 11 October 2007 (PDT)
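On the %X and %x point above: Python's printf-style formatting borrows the same two conversions from C, so the distinction can be shown in a few lines. This is an illustration in Python only; the variable names are hypothetical, and the last line stands in for the llToLower/llToUpper style of changing case after the digits have already been produced.

```python
value = 0x2ABC

lower = "%x" % value        # lower case digits, as with C's %x
upper = "%X" % value        # upper case digits, as with C's %X
converted = lower.upper()   # case changed afterwards, at the cost of an extra pass

print(lower, upper, converted)  # 2abc 2ABC 2ABC
```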
Yes we disagree strongly over the meaning of the pain of lag, the meaning of the pages referenced, the definition of due diligence, and the expository value of llToLower. We agree that bits2nybbles and hexu and hex in the article are by now very different things. -- Ppaatt Lynagh 07:40, 13 October 2007 (PDT)