[RndTbl] PHP undefined vars / array indices

John Lange john at johnlange.ca
Thu Jan 13 19:30:02 CST 2022


Thanks for the correction Adam (and btw, it's also valid in C++). So I
should have said "*you would only ever see "!$i++" syntax in weakly-typed
languages*". And also, I should never say never...

John

On Thu, Jan 13, 2022 at 4:33 PM Adam Thompson <athompso at athompso.net> wrote:

> I don’t really have a horse in this race, but I think John made one
> factual error.  It’s minor and doesn’t change his point, but:
>
>     if(!i++) { ... }
>
> is perfectly valid in C, where 0==false.  It’s equivalent to (I think…):
>
>     if(i==0) { i++; ... } else { i++; ... };
>
> IIRC, C has the comma operator, albeit thankfully rarely used, so if you
> really wanted to do this, I think it could be slightly better written:
>
>     if(i==0, i++) { ... }
>
> I could be wrong, I haven’t attempted to write C code in 25+ years.  It’s
> not something you would want to see, but you certainly *could*.
>
>
>
> FWIW, I agree with both sides in the original debate: PHP’s ultra-weak
> typing and initialization lay pervasive traps for programmers, whether they
> be incompetent, lazy, tired, or merely distracted.  (And based on my lived
> experience, the overwhelming majority of programmers – nay, **people** in
> general – are NOT “highly competent”: they’re “good enough”.)  But at the
> same time, this is a seemingly-gratuitous language change that reminds me
> VERY strongly of systemd all of a sudden being shoved down everyone’s
> throats, even though systemd introduces some good features that were
> previously lacking.  And list($a,$b,$c)=… is a useful notation in PHP.
>
>
>
> Perhaps this is like Perl 6, where it’s nearly a completely different
> language from the previous version, sharing the name and basic syntax?  But
> at least there’s still a team patching security holes in Perl 5.  I have no
> confidence that would happen with PHP.
>
> -Adam
>
>
>
> *From:* Roundtable <roundtable-bounces at muug.ca> *On Behalf Of *John Lange
> *Sent:* Thursday, January 13, 2022 2:18 PM
> *To:* Continuation of Round Table discussion <roundtable at muug.ca>
> *Subject:* Re: [RndTbl] PHP undefined vars / array indices
>
>
>
> As a former professional PHP programmer and current hobbyist programmer
> (not in PHP though), I agree with Trevor. (disclaimer: I did not go back
> and re-read all the PHP threads on this topic).
>
>
>
> PHP made a fundamental change to the way the language works which breaks
> backwards compatibility and has not provided any concrete evidence that
> supports the published justification for this change. "style" and "best
> practice" really are just opinions.
>
>
>
> I also don't see any reason why PHP could not have defined a
> global "use_strict= true/false" parameter similar to the approach that Perl
> took years back. The default could even be "true", if they want to
> emphasize the importance of it.
>
>
>
> However, I do agree that it's not good programming practice. Consider this
> example:
>
>
>
> function Foo() {
>  while ( $i < 5 )
>   if (!$i++) {}
>
>   // ... (a whole bunch more lines of code go here) ..
>
>
>
>  // inadvertently using the same variable because $i is your favorite
>  // 'counter' and you forgot you already used it
>  while ( $i < 5 )
>   if (!$i++) {  } // This line never runs
>
> }
>
>
>
> On a side note, you would only ever see "!$i++" syntax in non-declarative
> languages like PHP. It makes no sense otherwise since integers can never be
> false. Aside from that, my personal preference is to code for readability
> and I find the statement is hard to interpret. So I prefer:
>
>
>
> if ($i == false) {
>
>   $i++;
>
>   // ... other stuff
>
> }
>
>
>
> But nevertheless, the point is I agree that PHP should not have broken
> backward compatibility. By doing so it will force many sites to remain on
> PHP 7.x thereby opening up the very real possibility that a 7.x security
> vulnerability will get exploited and cause mass-grief (log4j anyone?).
>
>
>
> John
>
>
>
>
>
> On Thu, Jan 13, 2022 at 3:04 AM Trevor Cordes <trevor at tecnopolis.ca>
> wrote:
>
> On 2022-01-10 J. King wrote:
> > I sympathize with your plight as you're dealing with an older code-
>
> Thank you for your constructive and well-reasoned reply.  It is nice to
> have another person to discuss this important issue with.  (I apologize
> in advance for the length of this reply.)
>
> I think it's helpful to dissect the issue into three parts: the
> risk posed by the original status quo, and the pain of change, and the
> freedom of the programmer.  In advance, I will readily grant that the
> uninitialized value (UV) problem is more often a possible "mistake" than
> the uninitialized array index (UAI) one.  I would also guess it is much
> more rare.  At least that is what my attempts to "fix" my code have
> revealed.  But everything I discuss applies pretty much equally to both.
> And the PHP8 RFC, though it separated the issues, ended up lumping them
> into the same result: turning them both into warnings.  Thus it follows
> that the arguments used to vilify one (UV) must closely or somewhat
> apply to the other (UAI).
>
> I've been using PHP (in production) since 1999 and v3.  I'm nearly
> positive UV and UAIs were not even a notice back then.  In fact, with
> the (yes, evil) register_globals they got rid of a long time ago (more
> below), the whole idea of knowing if a var was initialized or not was
> impossible (without jumping through the isset hoops which almost nobody
> did, especially since ?? didn't yet exist).
>
> So let's establish that PHP from v1 through v3 not only didn't care, it
> kind of mandated the programmer not check for uninitialized variables
> (UV).  In other words, it not only said it was "ok", it said "this is
> the way it's done".  I can go into any of my ancient PHP3 books (from
> various publishers) and guarantee you I will not find one example
> that bothers to initialize a variable when there was no reason to.
> (See example #0 at bottom.)
>
> And that's ok, because many other loose, untyped, scripting languages
> that were popular around 1999 were exactly the same way (e.g. perl).  In
> fact, I would posit that they were designed this way!  It wasn't an
> oversight or laziness: it was the desire to have the programming
> language do more of the work for you, and to reduce the code line count
> (vs C) required to get a job done (hence why the scripting languages
> were considered "rapid" and (often) "prototype").
>
> Were the developers of PHP in 1999 ignorant?  Or were they trying to
> create a language suited for a purpose in a way that was similar to its
> peers and required "less work" than the heavy alternatives?
>
> So what's really the problem here?  All I hear from the "pro" side is
> "times have changed", "it's not good practice", "it's not robust",
> "legacy", "possible logic errors", "doing things wrong".  And they are
> saying the original makers of PHP, all the book authors from back then,
> and a generation of programmers who used it that way were/are wrong and
> must change (certainly by the time the "warning" is turned into a
> runtime error).  OK, someone could have that opinion, I get it,
> especially someone using PHP for less than 5-10 years.  But before they
> make me change 20k+ actual lines of code to make it uglier and less
> readable (IMO) they better have some really good, concrete arguments.
> The onus should be on the ones making the sweeping changes.
>
> As an aside, I have a B.Sc in CompSci (around the Java era, though I
> rarely used Java) and took every major related course, learning about 10
> languages.  Not once was I told I had to "initialize my variables",
> and certainly not check for UAI.  Not once did I lose a mark for not
> doing so.  If the language being used required it, then you did it.  If
> it didn't, then all that mattered was having a correct, readable
> program.  Maybe that has changed and all CompSci teachers now mandate
> initializing all variables in all programs in all languages.  However,
> that kind of proves that this choice is rather arbitrary and more a
> function of "style".
>
> I looked for, but do not seem to have access to, the internal PHP
> discussions generated by that RFC, so I cannot know what arguments
> people were using on either side.  I know what my main argument is:
> prove to me it's better, or safer, or any of the other condescending
> descriptors used in favor of banning UVs/UAIs.  I have not been able to
> find a single concrete example of how UVs actually achieve this level of
> devilish behavior that will bring down the entire internet if left as
> notices instead of warnings.  I would love to see some sample code a
> real, half-competent programmer would use that could result in a
> security hole.  I can assure you that not one single program I've
> written in 30 years of script-language usage suffers from a bug caused
> by this.  To bastardize Jerry Maguire: Show Me The Bugs!
>
> I would guess/hope a competent programmer would think like I do (and the
> original creators of PHP clearly did), and at every point they are going
> to use a variable (that they aren't staring at a previous use of on the
> same code page/scope) they say to themselves "this variable might be
> unset".  The base assumption is *always* the variable is unset.
> Further, if you stick with the general paradigm that a value of 0 in
> your program is a negative indication (as is false, null, etc.) then
> unset is just as good!  That is the precise reason one uses an
> untyped/loose-typed language!  The language does the work of following
> its rules to massage $x into the form needed for if(!$x) or ++$x!
>
> I've been told often in situations like this that "not all programmers
> are competent" and so need such "hand holding" to protect themselves.
> I can understand that point, but why should it be *forced* on all of
> the competent programmers?  Why are we *forcing* the lowest common
> denominator?  Why aren't we allowing people and projects that have such
> people or needs to turn on a configuration option (or even make it the
> default)?  Why don't we give those who don't need their hands held the
> option to stick with the multi-decade status quo?  Perl did.  Even
> insane-for-backwards-compatibility python did not institute a change
> between v2 and v3 that required changes to 5-20% of a codebase!  At the
> bottom of the "I know better", "hold their hands" slope lies Logo. That
> is not useful to me.
>
> I have a few personal philosophies.  One is "always forward, never
> back", especially when it comes to computers.  I still use many perl
> scripts I wrote in 1992+ and haven't modified since.  I use many
> others that required a line or two changed once in a while when the OS
> upgraded; in perl, php, python, js, etc.  Same reason I use
> Linux/XFCE/sawfish instead of Windows or GNOME.  If I'm going to spend
> a day coding or configuring, it's going to be to get something new done
> (i.e. make a new program) that will move me forward and build on my
> past efforts, not fighting with some arbitrary change foisted upon me
> that breaks everything in a horrific manner just to get me back to
> where I was yesterday!  Those types of OSs and languages and software I
> expelled from my life ages ago.  When PHP makes UV/UAI a runtime error,
> it will be the first time PHP has broken my rule, out of 23 years.
> (Perl has never broken this rule, out of 30+ years.)
>
> Register_globals makes for an interesting comparison.  It was on by
> default, encouraged, and used everywhere until, what, PHP5 (or 4)?
> That feature has plainly obvious and trivial examples as to why it can
> be a risk and a security hole.  In fact, the thing that made it a risk
> is precisely that it broke the promise that a var you didn't initialize
> was going to be unset.  You had no clue what any random $identifier was
> going to contain, because it was externally controlled.  The fact that
> the feature was killed like a decade ago proves it was a massively
> bigger risk than UV/UAI is now.
>
> So what about the fix/mitigation/pain aspect?  Well, that was easy/quick
> relative to the mess UV/UAI causes.  Just look at the handful of
> get/post (and cookie) vars your page expects (something that can be
> grepped!) and change them to $_GET[], et al, or "init" them yourself at
> the top with $expected=$_GET['expected'].  A handful of lines to edit
> with a very low probability of introducing bugs.  I remember when I had
> to do this to every script I ever wrote and it was quick and easy and
> painless.  So the overall equation of minimizing risk vs the pain of
> the solution was extremely favorable.  Now apply that same calculation
> to UV/UAI.  I challenge you to illustrate for me that the two variables
> of the equation are similar to the register_globals situation.
>
> Likewise, I was appalled, and then relieved when the PHP devs said they
> wanted to get rid of <? ?> short tags but decided against it.  It's
> very similar to the UV/UAI issue because it's a solution in search of a
> problem, something that has been used since day 1, actively encouraged,
> and the risk/pain equation is horrifically lopsided.  And at least that
> problem can be grepped!  UAI really cannot.  That <? ?> came as close
> as it did to being deprecated/errored indicates that there is a real
> problem with the PHP leadership/voters' mindset where many really don't
> care at all about existing codebases or valuable programmer time.
>
> > I disagree with your characterization of the 36 who voted in favour of
> > an error exception as tyrants, though I do agree it would have been a
> > step too far.
> [...]
> > notices as no big deal, and you're probably setting yourself up for
> > more tears by not adapting now.
>
> You agree it's a "step too far" but then hint in the second part what
> we all know is coming: one day (soon) PHP devs will change UV into a
> full blown error.  I bet many are itching to make UAI one too!  Give it
> a few extra years.
>
> However, I won't be crying, because I can, and will, tweak a few lines
> of PHP source code and compile my own (rpm tools on RH-based systems
> make this exceptionally easy).  I've decided UV/UAI is insanity,
> especially if no one can explain how this is even a small security hole
> for *me*, and the path of least pain with equal gain is to maintain my
> own version, perhaps called "sane-php" or "freedom-php".
>
> I know for a fact I won't be the only one.  UV/UAI and <? ?> nonsense is
> precisely what prompts people to flee a project, or fork.  The reason
> there's not an uproar yet is that some estimates have PHP8 usage at
> only around 1% at the moment.  Just wait until RHEL ships it by default
> and all the LTS Deb/Ubuntus with PHP7 go EOL.  Heck, even Fedora had to
> delay PHP8 2 or 3 releases vs planned because it caused so much grief.
>
> One more thought: this UV/UAI "fix" could actually cause more security
> holes and less attention to notices/warnings overall because, as of
> now, the easy fix is to disable logging of warnings.  If your site gets
> reams of traffic, you'll practically be forced to go ~E_WARNING,
> otherwise your log disks will fill up and you'll kill your SSD
> lifespan.  That will cause "real" warnings (which I do want to see!) to
> go completely unnoticed.  How is that a better result?  I won't be the
> only one... It'll make dev a disaster because on a dev box, which will
> have display_errors=on, I won't see any of the egregious warnings
> in-page, making bug-free development that much harder.  If the goal is
> to get more attention to warnings/notices, UV/UAI might actually do the
> opposite.
>
> > php > error_reporting(\E_ALL);
> > php > $v = ["ook", "eek"];
> > php > [$ook, $eek, $ack] = $v;
> > PHP Warning:  Undefined array key 2 in php shell code on line 1
> >
> > That notices have historically been hidden does not excuse you from
> > checking your inputs and/or correcting them where needed.
>
> What if the function returns 2 values in some cases, and 3 in others?
> Yes, that might not be a great design choice, but it's a legitimate one
> in that it can be logically correct.  You are punishing the caller with
> a warning for a design choice inside the function (they may not have
> written).  And you've given them no way to work around it without a whole
> whack of extra, ugly lines of ?? or isset on a temporary result array
> variable.  All that pain for what gain?  Show me the security hole.
>
> > I'll grant you that dealing with undefined array indices used to be
> > kind of awkward, such that you had to pepper your code with a lot of
> > isset() or array_key_exists() (depending on whether null is an
> > expected value for an array member), but that's nevertheless what you
> > had to do to have robust logic.
>
> Why?  As above, prove it's "robust" vs the alternative.  The RFC and
> random forum/bug discussions I've seen about this do nothing but bandy
> about the condescending descriptors with nary a shred of evidence of
> said "unrobustness".
>
> > > No tyranny.  Freedom and choice.  Perl, like PHP, has never been
> > > strictly typed or declared.  Perl will never force you to use strict.
> > > Never.  So why is PHP?
> >
> > In my opinion you proceed from a false assumption. As I stated above,
> > accessing an undefined variable has always been an error (at least
> > since PHP 4, if not earlier), albeit a putatively mild one. It is now
> > merely less-mild.
>
> What is the false assumption?  It is irrelevant that they made it a
> "notice" in PHP2, 3, 4 or 5, and a warning in 8.  The main problem,
> which you don't address, is the freedom to choose.  Why can't we have
> that freedom?  Why do the 36 get to choose for the world and why do
> they seem so uniquely oblivious to existing codebases?  I always
> thought python was bad, but at least they admit that, by design, they
> will screw you over every major release and always have!
>
> We are not talking about a change that allows the PHP language
> developers to rip out thousands of C lines of cruft or make the language
> perform 30% faster.  We are talking about, I'm guessing, 1 to 10 lines
> of C code changed and 0% speedup.  All for what?
>
> > Again, I sympathize since you're dealing with an old code-base. I
> > nevertheless believe you've been doing things wrong by treating
> > notices as no big deal, and you're probably setting yourself up for
> > more tears by not adapting now.
>
> I always code perl with -w on, but strict off.  That's the equivalent
> of seeing all notices/warnings in PHP.  In the olden days of PHP I'd log
> E_ALL because only "sane" things were warnings... kind of like perl.  I
> don't know where along the way I ~E_NOTICEd, but over the years PHP has
> made enough non-problems into notices that, yes, I completely ignore
> notices now.  Unlike in perl, they don't seem to ever match up with an
> "oopsy" moment: they simply moan about reams of style choices that I
> disagree with.
>
> I trusted the PHP devs to make good decisions as to what they were
> going to warn or deprecate, and I really have agreed with their choices
> on everything... until now.  Now it feels like they've jumped on the
> "change for change's sake" bandwagon like some other FLOSS projects; or
> worse, the "our style is right and you'll like it" attitude.  But at
> least most of those other projects, if forced to still use them,
> provided some gravy!  There is zero upside *for me* for the PHP UV/UAI
> changes.
>
> Since I have no logic error or security hole in my program because of
> UV/UAI, the only "doing things wrong" I've done is to trust that the PHP
> braintrust will keep making sane choices and/or maintain my freedom to
> choose.
>
> I will end with a lame appeal to authority, but one that is meaningful
> to me as a longtime fan:  Rasmus Lerdorf, inventor of PHP, UV vote "keep
> notice", UAI vote: "keep notice".  Rasmus gets it.  (Oh ya, and Linus
> would never allow a change like this in the Linux Kernel without
> massively obvious and good reasons.  And...)
>
>
> APPENDIX:
>
> Example #0:
>
> function Foo() {
>  while ( ... )
>   if (!$i++) {}
> }
>
> Is a perfectly reasonable/concise, if lazy, paradigm for detecting the
> first time in a loop and keeping a count of iterations as a bonus.
> This is now a warning in PHP8.  Is anything at all gained by putting
> "$i=0" before the while, maybe at the top of the page or the start of a
> function?  If I wanted to be forced to do that, I'd be using a language
> that wants me to.  And even if one thinks $i=0 should be required, it's
> not universal, as perl happily accepts that without warning even with
> warnings on!  perl sees that and says "the programmer has his big boy
> boots on and does not need coddling".  Even with strict on, as long as
> you declared it with my($i) (even without initializing it!) perl
> doesn't complain.
>
> _______________________________________________
> Roundtable mailing list
> Roundtable at muug.ca
> https://muug.ca/mailman/listinfo/roundtable
>
>
>
>
> --
>
> John Lange


-- 
John Lange