Talk:Hex

From Second Life Wiki

Principle Of Least Astonishment

In the history of the hex article I see the claim "programing shouldn't be astonishing". My history of pain and consequent bias includes the counterclaim that human interaction with machines is frequently astonishing. See http://en.wikipedia.org/wiki/Principle_of_least_astonishment -- Ppaatt Lynagh 09:22, 14 October 2007 (PDT)

Due Diligence

Part of the difference between us is that my history of pain tells me it is irresponsible to claim "substantial" differences in how fast or small we made code, at the expense of the to-me-most-essential "correct at a glance" quality, without quoting reproducible numbers. When we don't publish a reproducible experimental setup, we're trading opinion, not science, hardly a worthwhile trade. When we mean to contribute the half-an-edit that is just the opinion to invite the community to finish our work by adding the science, then I think we should include an explicit link over here to this Talk:Hex tab. I see with surprise that Strife chose to delete my explicit invitations for the community to complete our unsubstantiated fast & small claims. -- Ppaatt Lynagh 09:16, 14 October 2007 (PDT)

Correct At A Glance

Strife,

I see you say "your opinions have helped clarify and shape my own opinions". Your talk has had the same effect on my talk. Thank you, for me that's great fun.

I do find marketing, journalism, popular science, etc. to be useful & worthwhile human activities with relevance to the community effort of making wikipedias maximally encyclopaedic, yes. So far as I know, that bias of mine is a difference between us.

I see you say "the article" "lacked" "a neutral point of view". I hope you know I emphatically agree. I hope you know I could not see this truth until after you pointed this truth out. For example, I did not understand the truth that "we find the first few implementations equally easy to call, because we have studied all those implementations enough to believe each actually does produce exactly the same output for any input" until after you explained. You explained that for you the implementations tuned to the point of not being correct at a glance were, for you, still easy to call. I see that now, but I didn't see that until after you showed me.

I agree that encyclopaedias should substitute a generic phrase in place of an equivalent trademarked phrase. I think "correct at a glance" is not a trademarked phrase. I think "correct at a glance" has a Google history of being used ~14,000 times or more, and says significantly more than "readable". I watched with interest as you pushing against me taught me to fill out the else clauses of the if-else-if of the hex routine, rather than letting hex shortcut them. I watched with interest as you pushing against me taught me to find my way to bits2nybbles as the least work core of this, rather than hexu. These are the experiences that taught me my bias is "correct at a glance", when contrasted with your "fast" and "small" bias. Over in User:Ppaatt_Lynagh I long ago said "Thanks to Strife Onizuka for helping to verbalise ... the "correct at a glance" distinctive of separateWords, etc."

I see you say 'if you do a Google search for '"Correct At A Glance"' my userpage comes up as hit #2'. I'm unable to reproduce this result? I checked the first few of the ~14,000 hits of http://www.google.com/search?q=%22correct+at+a+glance%22 and I checked the first few of the ~12,000,000 hits of http://www.google.com/search?q=correct+at+a+glance. Please can you describe your experiment in more detail? I agree my pro bono work to date has given me an unfair share of Google attention, and therefore gives my favourite catchphrases unusually many hits.

-- Ppaatt Lynagh 09:03, 14 October 2007 (PDT)

I was wondering if those other hits were you (but decided it would be too much like stalking to ask). I didn't think it was a TM or it would have come up as one... just sounded like it should be one. It's probably google learning that I do a lot of SL related searches and elevating those. screen-shot. I'd like to say the reason I didn't give reasons before was that I didn't have anything nice to say but really I'm just not in the habit of writing explanations. I feel I was a bit harsh with you. -- Strife Onizuka 09:33, 14 October 2007 (PDT)

Second Talk

I've rewritten the article to try to include all the talk here thru now.

I'm curious to know if everyone agrees that we now present a http://en.wikipedia.org/wiki/Wikipedia:Neutral_point_of_view that accommodates our different perspectives on the relative measures and importance of such fuzzy source code qualities as:

  1. easy to use
  2. correct at a glance
  3. small
  4. fast

-- Ppaatt Lynagh 08:52, 11 October 2007 (PDT)

It looks good. I'll write a tight version. -- Strife Onizuka 21:42, 11 October 2007 (PDT)
Thanks for making time to say we are progressing towards consensus on an NPOV. I'm happy to hear that. I have to turn my attention elsewhere awhile, but now all the more I look forward to returning to help again soon, maybe as soon as next week. -- Ppaatt Lynagh 07:04, 12 October 2007 (PDT)
I see here you promised, and in the article you delivered, a small implementation to present alongside the correct-at-a-glance and fast implementations. Thank you. -- Ppaatt Lynagh 06:34, 13 October 2007 (PDT)
I'm also surprised & disconcerted. I see you made more than one change, with no comment in the change history on why and no talk here to explain why. I specifically see you rewrote the section headings and moved some of the design rationale up front. I don't know why you did this. To my eye, these changes injure the clarity of the article: they distract the reader into dispute, rather than presenting all consensus up front. With no talk here to prepare me, I'm left guessing wildly how you could have felt these changes were improvements. I'll do my best to accept the spirit of what you may have had in mind, while also restoring the qualities of the article that matter more to me. I'll comment my every change, and as you can see I began here first with more talk. Hope this helps, -- Ppaatt Lynagh 06:34, 13 October 2007 (PDT)
So by now you can see my guesses of your intent include:
1. you want the page to grow to also discuss other specifications for conversion to hex,
2. you contributed an example in that direction as well as an example of a small implementation,
3. you want more emphasis on how all correctly understood implementations that conform to the same specification share the quality of being easy to call
As the page evolves towards an ever more neutral and complete point of view, I've tried to retain those qualities in particular. -- Ppaatt Lynagh 07:28, 13 October 2007 (PDT)
I'm going to respond in no particular order to both your comments and revision because it's tough to put things into a chronological order. Your opinions have helped clarify and shape my own opinions; sometimes they are in step with each other, other times not. My main reason for the edit was to give the article a neutral point of view, which it lacked. It was biased towards a specific single implementation and was single implementation centric. I'm not blaming you, the shape of the article has a lot to do with how it was formed; we must now roll with the punches.
I reworked the ordering and description because if the user is to choose a version of the code, they need to know the design rationale; they need to know why one block of code is better than the other for the particular situation. The purpose of the section at the top was to make clear that no one block of code was perfect for all situations. You can't put that at the way bottom when you present four different versions; it needs to be at the forefront to explain why there are four different versions. The reason for the section retitling was so that it better fitted with the new description; in an optimization context the titles didn't quite fit. Readability has a documented technical history in computer programming (check Wikipedia); "Correct At A Glance" on the other hand sounds like a marketing catchphrase that needs a (TM) after it (if you do a Google search for '"Correct At A Glance"' your userpage comes up as hit #2).

[ http://en.wikipedia.org/wiki/Readability#In_programming may have been meant. ]

Your comment about "wading thru" puts forth the notion that there is a debate and it needs a resolution through a single implementation. There does not need to be a debate, nor a single implementation. A single implementation would not do justice to the LSL or programming community in general. By releasing only one implementation we force a solution on the populace that is not always appropriate; by providing different implementations optimized for common use-cases this can be ameliorated. I thus see your comment as counterproductive towards a neutral point of view, seeing that it is biased to a particular outcome.
The first three functions are all equally easy to use, they have identical interfaces, claiming Ease Of Use on a single one of them is disadvantaging the rest. Ease Of Use are weasel words in this case due to their difficulty to qualify.
I disagree with the statement "... at the expense of other qualities"; it is a generalization that implies that it is impossible to optimize the code for several if not all qualities. The sentence that follows it should be stripped outright: "we" is a weasel word for "I" and shows your bias. If you can't be bothered to test the other versions you shouldn't be lodging accusations against them on something that can't be qualified.
Currently there is only one LSL compiler and it does not optimize code (unless the folks at OpenSim/LibSL have finished theirs and I don't think they are writing an optimizing compiler regardless). The compiler is quite predictable in both its LSO and CLI output: it does not optimize code, it has not optimized code for 4 years, there have been no sounds from LL that it will ever output optimized code. LL at this time is waiting for Mono before they make any changes to the language. There has been community noise about opening SL to other CIL enabled languages, and LL has been very receptive to this; if it happens, few improvements are likely to happen to LSL. I bring this up because any discussion about Mono and "naive compilation" in an article not about the compiler is off-topic. I'm going to suggest that "naive compilation" be stripped from the article outright; it is a term that is not widely used and does not appear in any glossary (that I could find with Google; I doubt any other search engine is going to be any better). Your use of "naive compilation" suggests that LSL can be compiled in a different fashion, which is not the case; it is damning with faint praise. If we were to be fair and keep the "naive compilation" sentiment it would need to be applied to "Correct At A Glance" in a most prejudicial fashion for it to be accurate.
The forth function: you butchered the meaning of the text I wrote. It's not how many times you call the function that provides a bytecode savings, it's how many places in the code the function is explicitly called that matters; let's not forget to mention that it's 0 to 8 times not 0 to 7 ("no more then 8 places").

[ The "fourth" function likely was meant. ]

User experience does not meet the metric for scientific review that looking at the bytecode can provide. Your unequal application of sentiments about Mono is prejudicial. The Mono compiler is already available (look in the client source) and detailed posts have been made on the forums and blog about how the Mono VM will be used. If my memory serves me correctly LL is executing the CIL bytecodes one at a time, so I doubt they will be able to take advantage of any compiler optimizations done by the VM. That is a discussion for another article. Mono is a big unknown and shouldn't be mentioned in articles except when there is some issue about whether the function will work properly. All the functions on the hex page should work properly in Mono; none of them exacerbate the right-to-left left-to-right ordering change.
Taking into consideration what we have said, I shall do a revision. Hopefully we have overcome the big hurdles.
-- Strife Onizuka 07:47, 14 October 2007 (PDT)
Please ignore the Ease of Call argument, it seems I misread the section (I missed the word 'few'). I'm still going to strike it anyway, it's not like you have to feed the functions specially prebuilt structures; they are all more or less equally easy to call. -- Strife Onizuka 08:09, 14 October 2007 (PDT)
Sorry I'm not sure I understand how to correctly ignore part of what you have posted and not yet made time to strike. Yes I do find clear brief specification a necessary element in making code easy to call.
You can see I've happily engaged your "correct at a glance" thoughts separately, far above.
Yes me echoing text in my own words to demonstrate my understanding butchers the text horribly when the text made no sense to me.
Yes I run in horror from the passive voice, e.g., "are typically characterized" even to the extreme of typing the word "we" into the article. I like typing the word "we" as an experiment. If you can't subscribe to the "we" statements as published, then you & I have not yet achieved a consensus neutral point of view.
I do believe forcing the reader to wade thru many implementations is pointless. I liked the article better when we presented only correct at a glance and fast. I am now especially pained to find the demo lost so far down beyond so many implementations. I guess you disagree somehow, I don't yet understand why, at first I thought only fast mattered to you, now you look more to be saying that everything except correct at a glance matters to you. I think it's fun that in the history of this article you and I both chose to present one implementation, but different implementations. I think I see I presented only a correct-at-a-glance implementation, trusting the reader to deploy and adapt that code reasonably. I think I see you substituted only a fast implementation, trusting the reader to reverse-engineer that code reasonably.
I now see you saying that multiple implementations is a service - a kind of menu from which readers will choose reasonably. From a journalism/ marketing point of view, I think readers are more likely to abandon the article altogether as rudely complex - more noise than signal.
The comments on the article history capture our agreement that we shouldn't claim that our inability to max all qualities simultaneously is inescapable. My "at the expense of other qualities" phrase was my attempt to echo your own words back at you, in my own words to show my understanding and confirm our having reached a neutral point of view. Your words were "Depending upon the situation one priority may trump another. There are four principal priorities in LSL in contention with each other." On review I can read that in two ways. At the time, I read that to be saying we should claim that our inability to max all qualities simultaneously is inescapable. Sorry I misunderstood.
Thanks again for helping me think, -- Ppaatt Lynagh 09:03, 14 October 2007 (PDT)

First Talk

The compiler doesn't optimize the output, for a function that is going to be used as a core function, speed & size should be the most important thing. Personally I wouldn't use hex, I'd only use hexu, LSL treats the upper bit as the sign bit; adding the sign is unnecessary for LSL. Of course who really transfers numbers between LSL scripts uses hex strings anyway? I use Base64 encoding for floats and just leave integers raw, or sometimes convert them to base64 too. Hex is just too slow to generate. -- Strife Onizuka 10:03, 10 October 2007 (PDT)
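To make the size comparison in this comment concrete, here is a minimal sketch in Python (not LSL) of sending a 32-bit integer as Base64 rather than as hex. The function names are hypothetical, and the big-endian byte order is an assumption about how a routine such as llIntegerToBase64 lays out its four bytes.

```python
import base64
import struct


def int_to_base64(value: int) -> str:
    """Pack a signed 32-bit integer into 4 big-endian bytes, then Base64-encode.

    Assumption: big-endian byte order; check the real routine before relying on it.
    """
    return base64.b64encode(struct.pack(">i", value)).decode("ascii")


def base64_to_int(text: str) -> int:
    """Inverse of int_to_base64."""
    return struct.unpack(">i", base64.b64decode(text))[0]


for n in (0, 0x2000, -1):
    encoded = int_to_base64(n)
    # Every 32-bit value encodes to 8 characters, including the "==" padding.
    print(n, encoded, len(encoded), base64_to_int(encoded))
```

The Base64 form is fixed-length and cheap to decode, but it is not readable by eye, which is the trade-off the replies below argue about.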

Two replies:

1.

I agree hexu and hex shouldn't be used as core functions for communication between scripts.

I agree we could/ should improve this hex article by adding a note to explain when people should prefer alternatives such as llIntegerToBase64.

I see you say " Of course who really transfers numbers between LSL scripts uses hex strings anyway?" Me, I call hex to show the hex to me, as an argument of llOwnerSay, when I'm learning stuff that involves hex, such as llGetObjectPermMask that we already now doc in terms of hex, e.g., saying what x2000 means in that context. Likely I'll also call hex when having the communication be in the form of LSL source fragments matters, as in Chatbot.

-- Ppaatt Lynagh 10:53, 10 October 2007 (PDT)

Using hex or hexu for user output is totally valid; I was only ruling out its use for script to script communications. In an instance like displaying the value of llGetObjectPermMask to the user you wouldn't want to use hex because it is a bit-field; having it display the sign and flip the bits would be very confusing, so hexu would be a better choice for bit-fields. -- Strife Onizuka 20:28, 10 October 2007 (PDT)
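The difference between the two routines under discussion can be sketched in Python (not LSL). The names hexu and hex_signed, the "0x" prefix, and the upper-case digits are illustrative assumptions, not the article's code: one view shows the 32 stored bits, the other shows a sign and a magnitude in the style of Python's hex.

```python
MASK32 = 0xFFFFFFFF


def to_int32(value: int) -> int:
    """Wrap an arbitrary Python int into the signed 32-bit range."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value


def hexu(value: int) -> str:
    """Unsigned view: print the 32 bits as stored (two's complement)."""
    return "0x" + format(value & MASK32, "X")


def hex_signed(value: int) -> str:
    """Signed view: a minus sign followed by the magnitude."""
    value = to_int32(value)
    sign = "-" if value < 0 else ""
    return sign + "0x" + format(abs(value), "X")


for n in (0x2000, 0x7FFFFFFF, -1, to_int32(0x80002000)):
    print(hexu(n), hex_signed(n))
```

For a value with bit 31 set, such as 0x80002000, the unsigned view prints 0x80002000 while the signed view prints -0x7FFFE000, which is the "flip the bits" effect described above. For values with bit 31 clear the two views agree.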
Thanks for working to hear my every point. I think I understand and agree with every point you have made time to type out here, I think I am inviting you to consider a wider variety of LSL Wiki visitors.
What's confusing is relative - not absolute. What's confusing changes by what you already know.
To me, you sound like a person who long ago learned the difference between the C ideas of signed and unsigned integers and the two's complement encoding of negative integers that makes you expect to see xFF... as the hex corresponding to cast to unsigned from the signed integer -1. Yes people like that often find signed integers pointlessly confusing. Indeed LSL and Java and Python and so on annoy such people by matching early C and differing from late C in defining only a signed integer type without also declaring an unsigned integer type.
Significantly many other people arrive with no understanding of unsigned integers and no understanding of two's complement. Languages like the LSL function library accommodate such people by bizarrely often leaving the most significant bit clear and wasted. For example, PERM_ALL of LSL is not xFF... but is instead x7FF....
In sharp contrast to the two's-complement unsigned folk, the people who know only signed integers find it confusing when a routine defined to convert to hex also implicitly converts to unsigned before converting to hex. Python got this wrong in the beginning, naturally siding first with the implementation people who naturally focus first on size and speed and AT&T C/ assembly tradition. Python eventually changed to get this right. For example, http://www.python.org/dev/peps/pep-0237/ discusses why not format "negative short ints" "as unsigned C long".
Thus, "using hex or hexu for user output is totally valid", yes. On "ruling out [its] use for script to script communications" again I can only follow you to a relative conclusion, not an absolute conclusion. Yes, when size and speed matter most, then yes people should turn instead to some other scheme. Situations which emphasise clarity over size and speed do occur significantly often. For example, the humans involved may feel the priority is to be able to reliably decode the script-to-script communication at a glance by eye. As for "an instance like displaying the value of llGetObjectPermMask to the user", the people who know only signed integers indeed do want hex to display the sign. They feel that hexu "flips the bits" by its implicit conversion to unsigned bitfield from signed bitfield, so hex and not hexu "would be a better choice for bit-fields".
Have I clearly distinguished the two populations of people involved? Can you see that the people who feel so differently than you do have equally well-functioning brains, just engaged with a different background? I think I have enough background myself to join either population, depending on context. I think we should tweak this hex article to suit both kinds of people, helping and trusting the reader to choose wisely according to either context that the reader holds.
-- Ppaatt Lynagh 07:08, 11 October 2007 (PDT)
P.S. The population of people confused by signed integers further subdivides. Significantly many of those people go farther and object to dropping the more significant nybbles. Thus for those people a 32-bit field like x82000 should be expressed as x00082000, since all digital numbers have value and length. People then further subdivide according to people who accept the Arabic tradition of writing least to most significant from right to left, and people who don't, such as Alan Turing. Also people who copy the meaningless 0 in from C to write 0x rather than x or h, and people who put the x or h on the left or on the right.
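As a small illustration of the fixed-width preference in this P.S., a Python sketch (not LSL) that always prints all eight nybbles of a 32-bit field. The name hexu_padded and the bare "x" prefix are hypothetical choices that mirror the notation used here.

```python
def hexu_padded(value: int, nybbles: int = 8) -> str:
    """Unsigned hex, zero-padded on the left to a fixed number of nybbles."""
    return "x" + format(value & 0xFFFFFFFF, "0{}X".format(nybbles))


print(hexu_padded(0x82000))  # x00082000
print(hexu_padded(-1))       # xFFFFFFFF
```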
I will admit I am biased towards the way C does it. LSL was designed to mimic C++ and written in C++ but that isn't an excuse. Just because C or Python jumps off a cliff doesn't mean I will follow.
All good script communications systems are hidden or obscured. To listen in on the communications you need specialized code in place. I favor speed in my communications, so I use specialized comm debugging scripts that also decode the messages to a human readable form. In the process of decoding the messages they can identify flaws in the communication logic as well as provide the debugging information to the user. I rarely find flaws in the comm logic, I spend a lot of time designing and testing the algorithms I use. It's a bit of an obsession...
People rarely count in hex or octal; the vast majority of the uses for both involve low-level programming, when working with the data in such a way that it is beneficial to see the bits. Negative hex obscures the bits... unless you are talking about floats. Floats don't use two's complement for negative numbers, so hex floats do not do an unsigned-signed conversion. Hex floats are the bee's knees when it comes to building floats by hand, and since LSL is compiled by GCC it supports hex floats on the string2float explicit typecast.
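The point that hex floats carry an explicit sign rather than a two's-complement encoding can be seen in a short Python sketch. This uses Python's own float.hex and float.fromhex on 64-bit floats; it says nothing about how LSL's 32-bit floats or its string-to-float cast behave in detail.

```python
# A hex float is sign, hex mantissa, and a power-of-two exponent.
pos = (1.5).hex()
neg = (-1.5).hex()
print(pos)  # 0x1.8000000000000p+0
print(neg)  # -0x1.8000000000000p+0

# Building a float by hand: 0x1.8 is 1.5, and p1 multiplies by 2**1.
built = float.fromhex("0x1.8p1")
print(built)  # 3.0
```

Negating the value changes only the leading sign; the mantissa digits stay the same, unlike the integer case discussed above.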
<rant>A signed bit-field doesn't make sense, it's not supposed to be treated like a number but a collection of bits; treating it like a number and displaying it with a sign is totally confusing and makes it harder to read. Sorry but I do not relish the thought of having to do bit arithmetic in my head just to read a bit-field that uses bit 31.</rant>
As to the people you mentioned in your PS, we shouldn't be catering to people who think the common way of displaying numbers is wrong. Such a discussion is reviving long dead debates: it's trolling. Catering to trolls is like asking for a sharp stick in the eye.
You are correct, we should be endeavoring to making the documentation more accessible to more people. -- Strife Onizuka 21:38, 11 October 2007 (PDT)
Yes we differ sharply in bias, though of course the bias of my school of poets is right. :-)
Yes we disagree strongly over the wisdom of unsigning ints, the value of obscurity, the definition of trolls, etc. Google tells me Python has a history of abandoning hex floats, I don't know why. I remember APL had a history of difficulties in communicating floats. We agree that inviting trolls to join is not helpful. -- Ppaatt Lynagh 07:40, 13 October 2007 (PDT)

2.

I'd like you to entertain the theory that I already understood every point you're making before you stated it, but that I'm inviting you to give attention to some other significant points also.

I think we're beginning to understand each other. I'm especially encouraged to see I guessed correctly that you would only ever call hexu, as you now confirm here. That is the choice that anyone who values mostly size and speed would make, yes.

I value clarity and conformity and size and speed, not just some of those, never always letting one trump all the others in every situation. For clarity in use and conformity to a widely accepted and much debated convention, I'd like to present a complete and correct implementation of the Python hex spec unchanged. I remember I created this page.

The Python hex spec has a documented history of moving from hexu to hex, as people who hold mainly to speed and size values clash with a wider community of people. At this moment our page here diverges from the Python hex spec only in that we return upper case at the hex level, same as at the hexu level. Perhaps we would all accept the compromise of calling hexu with the list of nybbles to return. Call with "0123456789ABCDEF" to get upper case back, call with "0123456789abcdef" to get lower case back, and meanwhile wish LSL let us specify default values for subroutine arguments.
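The compromise proposed here, passing the digit table to hexu, can be sketched in Python, which does allow the default argument values that LSL lacks. The function body and the "0x" prefix are illustrative assumptions, not the article's implementation.

```python
UPPER = "0123456789ABCDEF"
LOWER = "0123456789abcdef"


def hexu(value: int, digits: str = UPPER) -> str:
    """Unsigned 32-bit hex using the caller's table of sixteen nybble characters."""
    value &= 0xFFFFFFFF
    out = ""
    while True:
        out = digits[value & 0xF] + out  # lowest nybble goes on the left of what we have
        value >>= 4
        if value == 0:
            break
    return "0x" + out


print(hexu(0x2F0A))         # 0x2F0A
print(hexu(0x2F0A, LOWER))  # 0x2f0a
```

In LSL, without default arguments, every call site would have to pass the table explicitly, which costs bytecode at each of those sites.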

For clarity in implementation, I liked having hex and hexu return lower and upper case respectively. Anyone understanding the code at a glance could tweak the code to get what they like, especially when assisted by comments like your comment that points people reading the code towards editing HEXC.

For clarity in implementation, I liked never using an assignment in a test. New people don't have to learn that idiom to read and correctly understand the code at a glance. Old people can trivially make that optimisation as they receive the code - it is a purely mechanical optimisation - people could even write a pre-processor to do that kind of transformation or even teach the client compiler to do it.
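The "assignment in a test" idiom and its mechanical rewrite can be shown side by side in Python (3.8 or later, for the := operator). Both function names are hypothetical and neither is the article's code; the point is only that the two forms produce identical output.

```python
DIGITS = "0123456789ABCDEF"


def nybbles_plain(value: int) -> str:
    """No assignment inside the loop test: shift first, then test."""
    value &= 0xFFFFFFFF
    out = DIGITS[value & 0xF]
    value = value >> 4
    while value != 0:
        out = DIGITS[value & 0xF] + out
        value = value >> 4
    return out


def nybbles_terse(value: int) -> str:
    """Same loop with the shift folded into the test as an assignment expression."""
    value &= 0xFFFFFFFF
    out = DIGITS[value & 0xF]
    while (value := value >> 4):
        out = DIGITS[value & 0xF] + out
    return out


for n in (0, 0x2000, 0x7FFFFFFF, -1):
    assert nybbles_plain(n) == nybbles_terse(n)
    print(nybbles_plain(n))
```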

If you insist on presenting code optimised to the point of unreadability here, maybe we could compromise on presenting a couple of different versions together. Up front one version to read, then another version claiming to produce exactly the same results but smaller or quicker, both versions here together, rather than on confusingly separated pages like the Separate Words and ParseString2List pages.

Personally my history of pain in SL has been that I first run out of human time to read code, then I run out of script space to run code, then I run out of lag time to run code. I gather your history of pain has been different, I'm sorry to hear of your pain.

Am I yet more clear than mud? Wholly coherent? Partly persuasive?

-- Ppaatt Lynagh 10:39, 10 October 2007 (PDT)

Providing multiple versions is a good idea and I almost did but the differences are so tiny I didn't see the point.
I've looked through [Google] Python's documentation, I can't find any reference to what case the output should be. This isn't Python, it's LSL; we should be writing code that is best suited for LSL. If it's not important enough to make it into Python's documentation then it was intended to be implementation specific. We can still be standard compliant and use a single case. LSL is limited by memory and execution speed. Conserving both is important. Keeping compliance with a perceived specification from another programming language isn't.
LSL is a bastard of a language (but so is Python, but that's another discussion). In most programming languages, good programming practices revolve around breaking code up into readable chunks and using intermediate variables. Doing that in LSL is a bad programming practice: it slows down the code and wastes valuable memory. The LSL compiler does not optimize code; intermediate variables are not optimized out, they stay on the stack until the function or event returns (or state changes). Variables declared in inner scopes are all given unique memory addresses on the stack; the address is not reused for other variables. Function calls are expensive in LSL: it's 18 bytes for a built-in function with no parameters or return, and 20 bytes for a user function with no parameters or return. The return costs 1 byte and each parameter costs however much it takes to copy it onto the stack. That's just the bytecode costs; there are stack costs too. LSL uses pass by value for every operation (the VM doesn't support pass by reference). The LSL memory space is 16KiB; computers haven't had that little memory to work in since the 1970s, and the concept of good programming practices wasn't even invented then.
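The byte counts quoted in this comment can be turned into a rough back-of-envelope model, sketched here in Python. Only the 20-byte user call, the 1-byte return, and the 16 KiB limit come from the comment; the body size, the per-parameter cost, and the way the pieces are summed are hypothetical simplifications that ignore stack costs.

```python
CALL_BYTES_USER = 20      # quoted above: call to a user function, no parameters or return
RETURN_BYTES = 1          # quoted above: cost of the return
MEMORY_BYTES = 16 * 1024  # quoted above: total LSL memory space


def bytes_as_function(body_bytes: int, call_sites: int, param_bytes: int = 0) -> int:
    """Rough bytecode size when the body is written once and called from each site."""
    return body_bytes + RETURN_BYTES + call_sites * (CALL_BYTES_USER + param_bytes)


def bytes_inlined(body_bytes: int, call_sites: int) -> int:
    """Rough bytecode size when the body is pasted at every site instead."""
    return body_bytes * call_sites


body = 100  # hypothetical body size in bytes, not measured
for sites in (1, 2, 3, 8):
    f = bytes_as_function(body, sites)
    i = bytes_inlined(body, sites)
    print(sites, f, i, "{:.1%} of memory".format(min(f, i) / MEMORY_BYTES))
```

Under this model the cost depends on the number of places a function is called from, not on how often it runs, and inlining wins only when the body is small or the call sites are few.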
Default arguments are a compiler thing, after it's compiled you wouldn't be able to tell the difference if it was a default or not by looking at the bytecode.
The case of hex digits is only important to the user; there isn't a technical reason for a distinction. In an environment so limited and hostile to the programmer as LSL is, concessions need to be made. Is something only the user might appreciate really necessary?
Constancy of case is easier for the user. Why should they return different cases? We shouldn't be inheriting other programming languages' dogma without good reason. -- Strife Onizuka 20:28, 10 October 2007 (PDT)
Sorry I don't feel heard yet.
I think we're not talking about Python-izing LSL, since of course LSL remains LSL. I think we're talking about borrowing an idiom that's proven useful in Python to reuse in LSL.
The http://docs.python.org/lib/built-in-funcs.html already cited by the article since its first publication is the correct Python docs link for the page there that rudely buries the short definition of Python hex in amongst a pile of other stuff, I'm sorry Google doesn't know that yet.
The upper/ lower case issue in hex has decades of standing, pre Google. The article originally gave casual reference to this history, I quote, "the easier-to-type lower case nybbles a la AT&T, rather than the easier-to-read upper case nybbles a la IBM". Python prints lower case because C prints lower case. The doc doesn't discuss this design choice because C people, aka AT&T people, take that design choice for granted.
By your guesstimate that "the differences are so tiny" I'll likely return later to provide corrections to make spec = Python = demo = sample results, and then trust you to either leave the corrections in place or to add a second example, rather than just reverting them for a third time. The corrections matter to me because I care about more than size and speed, as explained above. I'll keep the hexu code factored out as an example of what people who care most about size and speed will choose to code.
Hope this helps, -- Ppaatt Lynagh 21:47, 10 October 2007 (PDT)
Also relevant is Python "SF bug #1224347: hex longs now print with lowercase letters just like their int counterparts."
The page you reference does not in fact define the case the function should return. I did the Google search because I wanted to see whether the documentation anywhere defined exactly how it should work. The point: you can't argue that it violates the spec if the spec is undefined. I did due diligence to find out what the spec was for the function and it turned up nothing of use.
C can print hex in both uppercase and lowercase with printf, %X and %x flags respectively. I don't actually care which case it is as long as it isn't changed at runtime. It's irresponsible programming to change it with llToLower/llToUpper; there is no measurable benefit, it's a waste of CPU time. CPU time is a limited resource that everyone has to share. No raindrop thinks it's responsible for the flood. As the designers of these functions, we are responsible for the flood of lag they create.
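The printf flags mentioned here have direct Python equivalents, which makes for a quick sketch of the two approaches being contrasted: choosing the case when the digits are produced, versus producing one case and converting the whole string afterwards. The second approach stands in for an llToLower call; the variable names are illustrative.

```python
value = 0x2F0A

upper = "%X" % value  # case chosen while formatting, as with C's %X
lower = "%x" % value  # case chosen while formatting, as with C's %x

# The alternative criticised above: format in one case, then make a second pass.
two_pass = ("%X" % value).lower()

print(upper, lower, two_pass)  # 2F0A 2f0a 2f0a
```

Both routes give the same string; the disagreement in this thread is over whether the second pass is worth its run-time cost in LSL.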
I am sorry, I was referring to the changes to hexu as tiny, not the changes to hex; I should have been more specific. -- Strife Onizuka 21:38, 11 October 2007 (PDT)
Yes we disagree strongly over the meaning of the pain of lag, the meaning of the pages referenced, the definition of due diligence, and the expository value of llToLower. We agree that bits2nybbles and hexu and hex in the article are by now very different things. -- Ppaatt Lynagh 07:40, 13 October 2007 (PDT)