[RndTbl] PHP undefined vars / array indices

John Lange john at johnlange.ca
Thu Jan 13 14:17:36 CST 2022


As a former professional PHP programmer and current hobbyist programmer
(not in PHP though), I agree with Trevor. (disclaimer: I did not go back
and re-read all the PHP threads on this topic).

PHP made a fundamental change to the way the language works, one that
breaks backwards compatibility, and has not provided any concrete evidence
that supports the published justification for this change. "Style" and
"best practice" really are just opinions.

I also don't see any reason why PHP could not have defined a
global "use_strict= true/false" parameter similar to the approach that Perl
took years back. The default could even be "true", if they want to
emphasize the importance of it.
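
For what it's worth, something close to that switch can be approximated in
userland today. The sketch below is only an illustration: allow_undefined()
is a hypothetical helper name, not a PHP setting, and the message prefixes
in the regex are my assumption about the wording PHP 7 and PHP 8 use. It
relies on the documented behavior that a set_error_handler() callback
returning true suppresses PHP's default handling, while returning false
falls through to it.

```php
<?php
// Hypothetical helper (not part of PHP): opt in or out of diagnostics
// for undefined variables and undefined array keys.
function allow_undefined(bool $allow): void
{
    set_error_handler(
        function (int $errno, string $errstr) use ($allow): bool {
            // Assumed message prefixes: "Undefined variable" and
            // "Undefined array key" (PHP 8), "Undefined index" and
            // "Undefined offset" (PHP 7).
            $isUndefined = preg_match(
                '/^Undefined (variable|array key|index|offset)/',
                $errstr
            ) === 1;

            // true  = handled here, nothing is logged or displayed
            // false = fall through to PHP's normal error handling
            return $allow && $isUndefined;
        },
        E_WARNING | E_NOTICE
    );
}

allow_undefined(true);

$row = ['name' => 'ook'];
$missing = $row['colour']; // no warning is emitted; $missing is null
```

Any other warning does not match the pattern and still reaches the normal
handler, which is the difference from a blanket ~E_WARNING. Calling
restore_error_handler() puts the previous handler back.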

However, I do agree that relying on uninitialized variables is not good
programming practice. Consider this example:

function Foo() {
  while ( $i < 5 ) {
    if (!$i++) {}
  }
  // ... (a whole bunch more lines of code go here) ...

  // Inadvertently using the same variable, because $i is your favorite
  // 'counter' and you forgot you already used it.
  while ( $i < 5 ) {
    if (!$i++) {  } // This line never runs
  }
}
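
A minimal runnable sketch of that trap, with the counter initialized so it
runs cleanly on PHP 8. The function name and the $reset flag are mine,
purely for illustration:

```php
<?php
// Hypothetical demo: returns how many times each loop body ran.
function count_loop_runs(bool $reset): array
{
    $first = 0;
    $second = 0;

    $i = 0;
    while ($i < 5) {
        $i++;
        $first++;
    }

    if ($reset) {
        $i = 0; // explicitly restart the counter before reusing it
    }
    while ($i < 5) { // without the reset, $i is already 5 here
        $i++;
        $second++;
    }

    return [$first, $second];
}

var_dump(count_loop_runs(false) === [5, 0]); // second loop never ran
var_dump(count_loop_runs(true) === [5, 5]);  // both loops ran
```

The habit of writing "$i = 0" in front of every loop is what catches the
second case.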

On a side note, you would only ever see "!$i++" syntax in loosely typed
languages like PHP. It makes no sense otherwise, since in a strictly typed
language an integer can never be false. Aside from that, my personal
preference is to code for readability, and I find the statement hard to
interpret. So I prefer:

if ($i == false) {
  $i++;
  // ... other stuff
}
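
To spell out what the terse form does: post-increment returns the old
value, so "!$i++" is true only on the first pass, while $i still counts
every pass. A small sketch, with $i initialized so it is warning-free on
PHP 8 (the variable names are only for illustration):

```php
<?php
$i = 0;
$flags = [];
for ($pass = 0; $pass < 3; $pass++) {
    // $i++ yields the value before incrementing: 0, then 1, then 2
    $flags[] = !$i++;
}

var_dump($flags === [true, false, false]); // only the first pass is flagged
var_dump($i === 3);                        // $i counted every pass
```

Note that my longer form only increments $i inside the if, so it marks the
first pass but does not keep a running count. If the count matters, the
increment has to move outside the if.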

But nevertheless, the point is I agree that PHP should not have broken
backward compatibility. By doing so it will force many sites to remain on
PHP 7.x thereby opening up the very real possibility that a 7.x security
vulnerability will get exploited and cause mass-grief (log4j anyone?).

John


On Thu, Jan 13, 2022 at 3:04 AM Trevor Cordes <trevor at tecnopolis.ca> wrote:

> On 2022-01-10 J. King wrote:
> > I sympathize with your plight as you're dealing with an older code-
>
> Thank you for your constructive and well-reasoned reply.  It is nice to
> have another person to discuss this important issue with.  (I apologize
> in advance for the length of this reply.)
>
> I think it's helpful to dissect the issue into three parts: the
> risk posed by the original status quo, the pain of change, and the
> freedom of the programmer.  In advance, I will readily grant that the
> uninitialized value (UV) problem is more often a possible "mistake" than
> the uninitialized array index (UAI) one.  I would also guess it is much
> more rare.  At least that is what my attempts to "fix" my code have
> revealed.  But everything I discuss applies pretty much equally to both.
> And the PHP8 RFC, though it separated the issues, ended up lumping them
> into the same result: turning them both into warnings.  Thus it follows
> that the arguments used to vilify one (UV) must closely or somewhat
> apply to the other (UAI).
>
> I've been using PHP (in production) since 1999 and v3.  I'm nearly
> positive UV and UAIs were not even a notice back then.  In fact, with
> the (yes, evil) register_globals they got rid of a long time ago (more
> below), the whole idea of knowing if a var was initialized or not was
> impossible (without jumping through the isset hoops which almost nobody
> did, especially since ?? didn't yet exist).
>
> So let's establish that PHP from v1 through v3 not only didn't care, it
> kind of mandated the programmer not check for uninitialized variables
> (UV).  In other words, it not only said it was "ok", it said "this is
> the way it's done".  I can go into any of my ancient PHP3 books (from
> many various publishers) and guarantee you I will not find one example
> that bothers to initialize a variable when there was no reason to.
> (See example #0 at bottom.)
>
> And that's ok, because many other loose, untyped, scripting languages
> that were popular around 1999 were exactly the same way (e.g. perl).  In
> fact, I would posit that they were designed this way!  It wasn't an
> oversight or laziness: it was the desire to have the programming
> language do more of the work for you, and to reduce the code line count
> (vs C) required to get a job done (hence why the scripting languages
> were considered "rapid" and (often) "prototype").
>
> Were the developers of PHP in 1999 ignorant?  Or were they trying to
> create a language suited for a purpose in a way that was similar to its
> peers and required "less work" than the heavy alternatives?
>
> So what's really the problem here?  All I hear from the "pro" side is
> "times have changed", "it's not good practice", "it's not robust",
> "legacy", "possible logic errors", "doing things wrong".  And they are
> saying the original makers of PHP, all the book authors from back then,
> and a generation of programmers who used it that way were/are wrong and
> must change (certainly by the time the "warning" is turned into a
> runtime error).  OK, someone could have that opinion, I get it,
> especially someone using PHP for less than 5-10 years.  But before they
> make me change 20k+ actual lines of code to make it uglier and less
> readable (IMO) they better have some really good, concrete arguments.
> The onus should be on the ones making the sweeping changes.
>
> As an aside, I have a B.Sc in CompSci (around the Java era, though I
> rarely used Java) and took every major related course, learning about 10
> languages.  Not once was I told I had to "initialize my variables",
> and certainly not check for UAI.  Not once did I lose a mark for not
> doing so.  If the language being used required it, then you did it.  If
> it didn't, then all that mattered was having a correct, readable
> program.  Maybe that has changed and all CompSci teachers now mandate
> initializing all variables in all programs in all languages.  However,
> that kind of proves that this choice is rather arbitrary and more a
> function of "style".
>
> I looked for, but do not seem to have access to, the internal PHP
> discussions generated by that RFC, so I cannot know what arguments
> people were using on either side.  I know what my main argument is:
> prove to me it's better, or safer, or any of the other condescending
> descriptors used in favor of banning UVs/UAIs.  I have not been able to
> find a single concrete example of how UVs actually achieve this level of
> devilish behavior that will bring down the entire internet if left as
> notices instead of warnings.  I would love to see some sample code a
> real, half-competent programmer would use that could result in a
> security hole.  I can assure you that not one single program I've
> written in 30 years of script-language usage suffers from a bug caused
> by this.  To bastardize Jerry Maguire: Show Me The Bugs!
>
> I would guess/hope a competent programmer would think like I do (and the
> original creators of PHP clearly did), and at every point they are going
> to use a variable (that they aren't staring at a previous use of on the
> same code page/scope) they say to themselves "this variable might be
> unset".  The base assumption is *always* the variable is unset.
> Further, if you stick with the general paradigm that a value of 0 in
> your program is a negative indication (as is false, null, etc.) then
> unset is just as good!  That is the precise reason one uses an
> untyped/loose-typed language!  The language does the work of following
> its rules to massage $x into the form needed for if(!$x) or ++$x!
>
> I've been told often in situations like this that "not all programmers
> are competent" and so need such "hand holding" to protect themselves.
> I can understand that point, but why should it be *forced* on all of
> the competent programmers?  Why are we *forcing* the lowest common
> denominator?  Why aren't we allowing people and projects that have such
> people or needs to turn on a configuration option (or even make it the
> default)?  Why don't we give those who don't need their hands held the
> option to stick with the multi-decade status quo?  Perl did.  Even
> insane-for-backwards-compatibility python did not institute a change
> between v2 and v3 that required changes to 5-20% of a codebase!  At the
> bottom of the "I know better", "hold their hands" slope lies Logo. That
> is not useful to me.
>
> I have a few personal philosophies.  One is "always forward, never
> back", especially when it comes to computers.  I still use many perl
> scripts I wrote in 1992+ and haven't modified since.  I use many
> others that required a line or two changed once in a while when the OS
> upgraded; in perl, php, python, js, etc.  Same reason I use
> Linux/XFCE/sawfish instead of Windows or GNOME.  If I'm going to spend
> a day coding or configuring, it's going to be to get something new done
> (i.e. make a new program) that will move me forward and build on my
> past efforts, not fighting with some arbitrary change foisted upon me
> that breaks everything in a horrific manner just to get me back to
> where I was yesterday!  Those types of OSs and languages and software I
> expelled from my life ages ago.  When PHP makes UV/UAI a runtime error,
> it will be the first time PHP has broken my rule, out of 23 years.
> (Perl has never broken this rule, out of 30+ years.)
>
> Register_globals makes for an interesting comparison.  It was on by
> default, encouraged, and used everywhere until, what, PHP5 (or 4)?
> That feature has plainly obvious and trivial examples as to why it can
> be a risk and a security hole.  In fact, the thing that made it a risk
> is precisely that it broke the promise that a var you didn't initialize
> was going to be unset.  You had no clue what any random $identifier was
> going to contain, because it was externally controlled.  The fact that
> the feature was killed like a decade ago proves it was a massively
> bigger risk than UV/UAI is now.
>
> So what about the fix/mitigation/pain aspect?  Well, that was easy/quick
> relative to the mess UV/UAI causes.  Just look at the handful of
> get/post (and cookie) vars your page expects (something that can be
> grepped!) and change them to $_GET[], et al, or "init" them yourself at
> the top with $expected=$_GET['expected'].  A handful of lines to edit
> with a very low probability of introducing bugs.  I remember when I had
> to do this to every script I ever wrote and it was quick and easy and
> painless.  So the overall equation of minimizing risk vs the pain of
> the solution was extremely favorable.  Now apply that same calculation
> to UV/UAI.  I challenge you to illustrate for me that the two variables
> of the equation are similar to the register_globals situation.
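
For anyone who missed that era, the fix looked roughly like this. It is a
sketch only: "expected" is the parameter name from the paragraph above,
$query stands in for $_GET so the snippet runs from the command line, and
the ?? default is the modern spelling (PHP 7 and later), not what anyone
wrote back then.

```php
<?php
// Stand-in for $_GET so the example runs outside a web server.
$query = ['expected' => '42'];

// With register_globals on, $expected simply appeared from the request.
// The replacement: read each expected parameter explicitly, once, at the top.
$expected = $query['expected'] ?? null;   // present in the request
$optional = $query['optional'] ?? 'none'; // absent, so the default is used

var_dump($expected === '42');
var_dump($optional === 'none');
```
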
>
> Likewise, I was appalled, and then relieved when the PHP devs said they
> wanted to get rid of <? ?> short tags but decided against it.  It's
> very similar to the UV/UAI issue because it's a solution in search of a
> problem, something that has been used since day 1, actively encouraged,
> and the risk/pain equation is horrifically lopsided.  And at least that
> problem can be grepped!  UAI really cannot.  That <? ?> came as close
> as it did to being deprecated/errored indicates that there is a real
> problem with the PHP leadership/voters' mindset where many really don't
> care at all about existing codebases or valuable programmer time.
>
> > I disagree with your characterization of the 36 who voted in favour of
> > an error exception as tyrants, though I do agree it would have been a
> > step too far.
> [...]
> > notices as no big deal, and you're probably setting yourself up for
> > more tears by not adapting now.
>
> You agree it's a "step too far" but then hint in the second part what
> we all know is coming: one day (soon) PHP devs will change UV into a
> full blown error.  I bet many are itching to make UAI one too!  Give it
> a few extra years.
>
> However, I won't be crying, because I can, and will, tweak a few lines
> of PHP source code and compile my own (rpm tools on RH-based systems
> make this exceptionally easy).  I've decided UV/UAI is insanity,
> especially if no one can explain how this is even a small security hole
> for *me*, and the path of least pain with equal gain is to maintain my
> own version, perhaps called "sane-php" or "freedom-php".
>
> I know for a fact I won't be the only one.  UV/UAI and <? ?> nonsense is
> precisely what prompts people to flee a project, or fork.  The reason
> there's not an uproar yet is that some estimates have PHP8 usage at
> only around 1% at the moment.  Just wait until RHEL ships it by default
> and all the LTS Deb/Ubuntus with PHP7 go EOL.  Heck, even Fedora had to
> delay PHP8 2 or 3 releases vs planned because it caused so much grief.
>
> One more thought: this UV/UAI "fix" could actually cause more security
> holes and less attention to notices/warnings overall because, as of
> now, the easy fix is to disable logging of warnings.  If your site gets
> reams of traffic, you'll practically be forced to go ~E_WARNING,
> otherwise your log disks will fill up and you'll kill your SSD
> lifespan.  That will cause "real" warnings (which I do want to see!) to
> go completely unnoticed.  How is that a better result?  I won't be the
> only one... It'll make dev a disaster because on a dev box, which will
> have display_errors=on, I won't see any of the egregious warnings
> in-page, making bug-free development that much harder.  If the goal is
> to get more attention to warnings/notices, UV/UAI might actually do the
> opposite.
>
> > php > error_reporting(\E_ALL);
> > php > $v = ["ook", "eek"];
> > php > [$ook, $eek, $ack] = $v;
> > PHP Warning:  Undefined array key 2 in php shell code on line 1
> >
> > That notices have historically been hidden does not excuse you from
> > checking your inputs and/or correcting them where needed.
>
> What if the function returns 2 values in some cases, and 3 in others?
> Yes, that might not be a great design choice, but it's a legitimate one
> in that it can be logically correct.  You are punishing the caller with
> a warning for a design choice inside the function (they may not have
> written).  And you've given them no way to work around it without a whole
> whack of extra, ugly lines of ?? or isset on a temporary result array
> variable.  All that pain for what gain?  Show me the security hole.
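
One hedged observation on this case: padding the result before
destructuring keeps it to a single extra call. two_or_three() below is a
made-up function standing in for the kind of function described;
array_pad() is a standard PHP function.

```php
<?php
// Hypothetical function that returns 2 values in some cases and 3 in others.
function two_or_three(bool $withThird): array
{
    return $withThird ? ['ook', 'eek', 'ack'] : ['ook', 'eek'];
}

// array_pad() fills the list up to 3 entries with null, so the
// destructuring never touches a missing key.
[$ook, $eek, $ack] = array_pad(two_or_three(false), 3, null);

var_dump($ack === null);
```

Whether that counts as ugly is a matter of taste, but it does avoid the
temporary result variable.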
>
> > I'll grant you that dealing with undefined array indices used to be
> > kind of awkward, such that you had to pepper your code with a lot of
> > isset() or array_key_exists() (depending on whether null is an
> > expected value for an array member), but that's nevertheless what you
> > had to do to have robust logic.
>
> Why?  As above, prove it's "robust" vs the alternative.  The RFC and
> random forum/bug discussions I've seen about this do nothing but bandy
> about the condescending descriptors with nary a shred of evidence of
> said "unrobustness".
>
> > > No tyranny.  Freedom and choice.  Perl, like PHP, has never been
> > > strictly typed or declared.  Perl will never force you to use
> > > strict.  Never. So why is PHP?
> >
> > In my opinion you proceed from a false assumption. As I stated above,
> > accessing an undefined variable has always been an error (at least
> > since PHP 4, if not earlier), albeit a putatively mild one. It is now
> > merely less-mild.
>
> What is the false assumption?  It is irrelevant that they made it a
> "notice" in PHP2, 3, 4 or 5, and a warning in 8.  The main problem,
> which you don't address, is the freedom to choose.  Why can't we have
> that freedom?  Why do the 36 get to choose for the world and why do
> they seem so uniquely oblivious to existing codebases?  I always
> thought python was bad, but at least they admit that, by design, they
> will screw you over every major release and always have!
>
> We are not talking about a change that allows the PHP language
> developers to rip out thousands of C lines of cruft or make the language
> perform 30% faster.  We are talking about, I'm guessing, 1 to 10 lines
> of C code changed and 0% speedup.  All for what?
>
> > Again, I sympathize since you're dealing with an old code-base. I
> > nevertheless believe you've been doing things wrong by treating
> > notices as no big deal, and you're probably setting yourself up for
> > more tears by not adapting now.
>
> I always code perl with -w on, but strict off.  That's the equivalent
> of seeing all notices/warnings in PHP.  In the olden days of PHP I'd log
> E_ALL because only "sane" things were warnings... kind of like perl.  I
> don't know where along the way I ~E_NOTICEd, but over the years PHP has
> made enough non-problems into notices that, yes, I completely ignore
> notices now.  Unlike in perl, they don't seem to ever match up with an
> "oopsy" moment: they simply moan about reams of style choices that I
> disagree with.
>
> I trusted the PHP devs to make good decisions as to what they were
> going to warn or deprecate, and I really have agreed with their choices
> on everything... until now.  Now it feels like they've jumped on the
> "change for change's sake" bandwagon like some other FLOSS projects; or
> worse, the "our style is right and you'll like it" attitude.  But at
> least most of those other projects, if forced to still use them,
> provided some gravy!  There is zero upside *for me* for the PHP UV/UAI
> changes.
>
> Since I have no logic error or security hole in my program because of
> UV/UAI, the only "doing things wrong" I've done is to trust that the PHP
> braintrust will keep making sane choices and/or maintain my freedom to
> choose.
>
> I will end with a lame appeal to authority, but one that is meaningful
> to me as a longtime fan:  Rasmus Lerdorf, inventor of PHP, UV vote "keep
> notice", UAI vote: "keep notice".  Rasmus gets it.  (Oh ya, and Linus
> would never allow a change like this in the Linux Kernel without
> massively obvious and good reasons.  And...)
>
>
> APPENDIX:
>
> Example #0:
>
> function Foo() {
>  while ( ... ) {
>   if (!$i++) {}
>  }
> }
>
> Is a perfectly reasonable/concise, if lazy, paradigm for detecting the
> first time in a loop and keeping a count of iterations as a bonus.
> This is now a warning in PHP8.  Is anything at all gained by putting
> "$i=0" before the while, maybe at the top of the page or the start of a
> function?  If I wanted to be forced to do that, I'd be using a language
> that wants me to.  And even if one thinks i=0 should be required, it's
> not universal, as perl happily accepts that without warning even with
> warnings on!  perl sees that and says "the programmer has his big boy
> boots on and does not need coddling".  Even with strict on, as long as
> you declared it with my($i) (even without initializing it!) perl
> doesn't complain.


-- 
John Lange