All, I have seen random number/name generation used in some tests recently and want to discuss it further here. I am going to go out on a limb and say that this is not a good thing and sets a bad precedent. I think the real value of tests is that they are an easily repeatable, consistent assessment of where the code is at a certain point in time. Tests really start to lose their value when two runs of the same test on the same code base can produce two different outcomes. I know that for things like name generation this probably won't happen, but as I said, it is starting down a slippery slope, and IMHO we should avoid it if possible.
Note that there is nothing wrong with *arbitrary* values, or values determined at random and hard coded into tests - let's just be sure that they are the same on subsequent runs of the same test.
Thoughts?
- Justin
On Wed, 2010-08-25 at 14:08 -0400, Justin Harris wrote:
candlepin mailing list candlepin@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/candlepin
We use timestamps like System.currentTimeMillis() in our tests - here's why.
1) It makes tests re-runnable. When tests create "stuff" on the software under test, you often can't rerun the test - because it will fail since the resource name already exists.
2) It allows you to figure out when resources were created just based on the name. Sometimes this is helpful in debugging.
Timestamps work better than random values because random values only fulfill #1 above.
Yes, there is a minor repeatability issue with using timestamps (or any calculated value), but I think the value of re-runnability far outweighs the loss here. The whole reason randoms are used in the first place is to avoid these collisions that will cause false positives in the test results.
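A minimal sketch of the timestamp naming described above, in Java. Only `System.currentTimeMillis()` comes from the message; the `TimestampNames` class, its methods, and the `testabc` prefix are hypothetical names chosen for illustration.

```java
import java.util.Date;

// Hypothetical helper: builds a resource name from a fixed prefix and a timestamp.
final class TimestampNames {
    private TimestampNames() {}

    // Pure function, so the name format can be checked with a known timestamp.
    static String uniqueName(String prefix, long millis) {
        return prefix + "-" + millis;
    }

    // Convenience overload that stamps the name with the current time.
    static String uniqueName(String prefix) {
        return uniqueName(prefix, System.currentTimeMillis());
    }

    // Recovers the creation time from a name, which can help when debugging.
    static long createdAt(String name) {
        return Long.parseLong(name.substring(name.lastIndexOf('-') + 1));
    }

    public static void main(String[] args) {
        String name = uniqueName("testabc");
        System.out.println(name + " was created at " + new Date(createdAt(name)));
    }
}
```

Two calls within the same millisecond would produce the same name, so this only avoids collisions between runs that are separated in time.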
Jeff
On Wed, Aug 25, 2010 at 3:08 PM, Justin Harris jharris@redhat.com wrote:
Any examples where this has or could go wrong? I can't really see the slippery slope here.
I find the random strings preferable to littering the code with hardcoded "bob"'s and "someproducts"'s that we truly don't care about. They just get copy pasted everywhere and often clash at some point.
Cheers,
Devan
R we seeing errors because of this?
--bk
On Aug 25, 2010, at 7:13 PM, Devan Goodwin dgoodwin@rm-rf.ca wrote:
On Thu, 26 Aug 2010, bkearney@redhat.com wrote:
R we seeing errors because of this?
Not at the moment, but I was noticing this in some of our rspecs. It is not an immediate problem, but I started to have flashbacks of previous projects where it was next to impossible to get all tests in the suite to pass because random values were generated for all kinds of stuff, and different tests would pass/fail on subsequent runs for no real reason except that these values were different. A big value that I see is that we can be very explicit about the inputs for a test and the outputs we expect. Often, when calculating inputs, you basically have to recompute the expected output, mirroring what happens in the actual method under test. This IMO kind of defeats the purpose, because most likely the person writing the test also wrote the feature and is baking whatever assumptions they have about the implementation into the test as well.
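To make the "mirroring" concern concrete, here is a small Java sketch. Everything in it is hypothetical and not taken from Candlepin: `discountedCents` stands in for a method under test and carries a deliberate bug, and the two check methods stand in for two styles of test.

```java
import java.util.Random;

final class MirroredExpectation {
    private MirroredExpectation() {}

    // Hypothetical method under test, with a deliberate bug:
    // it divides by 10 where it should divide by 100.
    static int discountedCents(int cents, int percent) {
        return cents - cents * percent / 10;
    }

    // Random input: the expected value has to be computed, and the test author
    // repeats the same wrong assumption, so the check passes despite the bug.
    static boolean mirroredCheck(Random random) {
        int cents = random.nextInt(10_000);
        int percent = random.nextInt(101);
        int expected = cents - cents * percent / 10;
        return discountedCents(cents, percent) == expected;
    }

    // Explicit input and explicit expected output: 25% off 1000 cents is 750 cents.
    static boolean explicitCheck() {
        return discountedCents(1000, 25) == 750;
    }

    public static void main(String[] args) {
        System.out.println("mirrored check passes: " + mirroredCheck(new Random()));
        System.out.println("explicit check passes: " + explicitCheck());
    }
}
```

The mirrored check cannot fail because it compares the formula with itself, while the explicit check exposes the bug.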
For name generation: what if you get a name collision on a 'register' call, which results in an error? A rerun probably would not collide, and the test runs fine. This IMO is not a good thing, because you get two runs of the same test with different outputs. It is true that a static name can also have a collision, but it is consistent and forces us to solve that issue in a more comprehensive way.
The real issue that this is trying to get around, I think, is that we need to get the data into a known state before running a test. This is an important part of test design and I think we should focus on that, which would make random number generation in tests unnecessary.
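A minimal Java sketch of the "known state" idea, under stated assumptions: `ConsumerRegistry` is a hypothetical in-memory stand-in for the server, and `reset()` stands in for whatever would actually wipe or reload the test database.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the system under test: it rejects duplicate names.
final class ConsumerRegistry {
    private final Set<String> names = new HashSet<>();

    void register(String name) {
        if (!names.add(name)) {
            throw new IllegalStateException("name already exists: " + name);
        }
    }

    // Puts the data back into a known (empty) state.
    void reset() {
        names.clear();
    }

    int size() {
        return names.size();
    }
}

final class KnownStateExample {
    private KnownStateExample() {}

    // One "test run": reset first, then register a static name.
    static void runTest(ConsumerRegistry registry) {
        registry.reset();
        registry.register("static-consumer");
    }

    public static void main(String[] args) {
        ConsumerRegistry registry = new ConsumerRegistry();
        runTest(registry);
        runTest(registry); // the rerun does not collide because of the reset
        System.out.println("registered consumers: " + registry.size());
    }
}
```

With the reset in place, the static name behaves the same on every run; without it, the second registration would throw.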
On Thu, 2010-08-26 at 09:24 -0400, Justin Harris wrote:
I do not see how this is possible. How would you get a name collision the first time? The names are supposedly random. The only way you'd get a collision is if the random numbers collide, which won't happen if you choose a sufficiently large random domain. Of course it won't collide EVER if you use timestamps as I suggested :) Well ok it could collide if two people run the same automated test against the same server at the same millisecond.
If you use names like this:
testabc-123452345
testabc-467546734
testabc-546756776
can you suggest a scenario where one name would fail and another would pass? I think that's *incredibly* unlikely. Sure, if you randomize the entire string and use all available ascii (or utf-8) characters, yes you could have inconsistency. But all we're trying to do is prevent collisions. If you use the format above you will prevent collisions AND have consistency.
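A short Java sketch of the name format above, assuming a fixed `testabc-` prefix followed by nine random digits. The `PrefixedNames` class and its `SHAPE` pattern are hypothetical; the point is only that every generated name has the same shape, so the digits are the only thing that varies between runs.

```java
import java.util.Locale;
import java.util.Random;
import java.util.regex.Pattern;

final class PrefixedNames {
    private PrefixedNames() {}

    // Every generated name matches this shape: fixed prefix, nine ASCII digits.
    static final Pattern SHAPE = Pattern.compile("testabc-\\d{9}");

    static String next(Random random) {
        // nextInt(1_000_000_000) yields 0..999999999; %09d pads to nine digits.
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        return String.format(Locale.ROOT, "testabc-%09d", random.nextInt(1_000_000_000));
    }

    public static void main(String[] args) {
        Random random = new Random();
        for (int i = 0; i < 3; i++) {
            String name = next(random);
            System.out.println(name + " matches shape: " + SHAPE.matcher(name).matches());
        }
    }
}
```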
Jeff
On Thu, 26 Aug 2010, Jeff Weiss wrote:
I do not see how this is possible. How would you get a name collision the first time? The names are supposedly random. The only way you'd
Well, you wouldn't. There is nothing to collide with. But my only point is that if you are running against a known data set then you should know what name you can use - and avoid using random names altogether. If the known data set is an empty db (hence no name collision the first time), then why not just use a static name?
get a collision is if the random numbers collide, which won't happen if you choose a sufficiently large random domain. Of course it won't collide EVER if you use timestamps as I suggested :) Well ok it could collide if two people run the same automated test against the same server at the same millisecond.
A valid point - please keep in mind this was meant to be a suggestion only for our (dev) functional tests, in which IMO we can have full control over the database to put it in a known state. For QE's runs dealing with Oracle and the broader infrastructure, I fully understand that this is not feasible and I am not trying to tell QE how to do their job. :)
I think this was meant more to be a theoretical discussion about the general notion of using randomly generated values in tests - and me throwing it out there to see what everyone thought (clearly everyone thinks I am wrong).
This is turning into the cucumber debacle of '10 all over again.
On Thu, 2010-08-26 at 09:52 -0400, Justin Harris wrote:
OK - if rerunning tests isn't important then you're right - there is no benefit to using random (or calculated) values - they should be static. (By rerunning I mean running tests more than once without resetting the database.)
On Thu, Aug 26, 2010 at 10:24 AM, Justin Harris jharris@redhat.com wrote:
Static strings are more likely to cause those collisions, and a redeploy will often wipe the db, making the error disappear. It seems like the same scenario to me: two runs, different results.
I agree to a point. Certainly common sense needs to be used: we shouldn't use a random integer for the quantity of a sub/pool, for instance. But for object names (consumers/products/owners), I genuinely cannot think of how it would be a problem aside from a collision, which is less likely than with statics. I would also definitely agree we should never use a random value directly related to something we're testing, like what characters are suitable for an owner name. :)
Static strings get copied everywhere (which is just bad), and coming up with a unique owner name every time we create one is kind of irritating. My pref would be to let the test util methods stub out random names for products/consumers/owners.
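One possible shape for such test util methods, as a Java sketch. The `TestUtil` class and its method names are hypothetical and are not Candlepin's actual helpers; the idea is that tests which do not care about a name ask the util for one instead of hardcoding it.

```java
import java.util.Random;

// Hypothetical test utility: supplies throwaway names so tests do not hardcode them.
final class TestUtil {
    private static final Random RANDOM = new Random();

    private TestUtil() {}

    // Appends a random non-negative number to a readable prefix.
    static String randomName(String prefix) {
        return prefix + "-" + RANDOM.nextInt(Integer.MAX_VALUE);
    }

    static String ownerName() {
        return randomName("owner");
    }

    static String productName() {
        return randomName("product");
    }

    static String consumerName() {
        return randomName("consumer");
    }

    public static void main(String[] args) {
        System.out.println(ownerName());
        System.out.println(productName());
        System.out.println(consumerName());
    }
}
```

A test that does care about a specific name would still pass it explicitly rather than use these helpers.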
That said, day to day, this issue is a far cry from a big one for me, so I'm cool with giving everything an explicit name if that's the route we wanna go.
Cheers,
Devan