True Maximum Cache Density

June 4, 2014

Hello everyone! I have been working on a graduate project about Geocaching and I was hoping to get your thoughts on some things.

I want to know what the maximum geocache density would be in a "saturated" area. I have searched the forums for this and found the posts talking about "hexagonal packing" and those were very useful.

But what I am hoping to figure out is what is the true maximum when you take into consideration all of the space taken up by buildings and roads and also when you take into account that people do not hide their caches according to some hexagonal packing rule.

For example, in my findings the COUNTY of New York has a geocache density of 8.6 geocaches per sq. mile. Is this high enough to consider the county saturated when you take into account all of the space taken up by streets, buildings, etc??

I found on the NY planning website a breakdown of land use. It showed what percentage of the county was taken up by buildings, streets, etc. Using only the remaining land area, the county had a geocache density more like 54 geocaches per sq. mile.
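The land-use adjustment described above is just a division by the usable share of the area. A minimal sketch, assuming a single "usable fraction" per county; the function name and the back-solved fraction are illustrative only, since the actual land-use percentages are not given in the post:

```python
def usable_area_density(caches, area_sq_mi, usable_fraction):
    """Caches per square mile, counting only land judged available for hiding."""
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    return caches / (area_sq_mi * usable_fraction)

# New York County figures quoted in this post.
raw_density = 290 / 33.68

# Hypothetical usable share, back-solved so the adjusted figure lands on 54.
# This is an inference from the two quoted densities, not a published number.
implied_fraction = raw_density / 54

print(f"raw density:      {raw_density:.2f} per sq. mile")
print(f"implied fraction: {implied_fraction:.3f}")
print(f"adjusted density: {usable_area_density(290, 33.68, implied_fraction):.1f} per sq. mile")
```

Read this way, the two quoted densities would imply that only a small share of the county was counted as available land.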

My problem is I do not have that sort of information for all of my data. Sooooo I need to find a number that represents a true maximum density. Below is some of the data that I have just to give you an idea...

State | County | # of Geocaches | Area (sq. miles) | Geocache Density (per sq. mile)
Virginia | Fredericksburg City | 158 | 11 | 15.019
Virginia | Falls Church City | 21 | 2 | 10.500
Nevada | Clark | 6,872 | 737.67 | 9.316
New York | New York | 290 | 33.68 | 8.610
Virginia | Alexandria City | 129 | 15 | 8.388
Minnesota | Ramsey | 1,399 | 170.16 | 8.222
Colorado | Broomfield | 269 | 33 | 8.152
Virginia | Arlington | 197 | 26 | 7.586
Illinois | Du Page | 2,381 | 336.87 | 7.068
Oregon | Multnomah | 3,060 | 465.7 | 6.571
Colorado | Denver | 949 | 154.88 | 6.127
California | Orange | 5,567 | 947.91 | 5.873
Minnesota | Hennepin | 3,544 | 606.43 | 5.844
Kentucky | Jefferson | 2,305 | 398.6 | 5.783
New Jersey | Union | 497 | 105.46 | 4.713
Alabama | Baldwin | 9,335 | 2027.08 | 4.605
Indiana | Marion | 1,839 | 403.11 | 4.562
Texas | Tarrant | 4,038 | 897.56 | 4.499
California | Contra Costa | 3,591 | 802.18 | 4.477
Virginia | Danville City | 194 | 44 | 4.415
Illinois | Cook | 7,121 | 1634.89 | 4.356
New Jersey | Passaic | 841 | 197.07 | 4.268
Ohio | Hamilton | 1,752 | 412.81 | 4.244
Michigan | Macomb | 2,349 | 569.81 | 4.122
Utah | Salt Lake | 3,318 | 807.83 | 4.107
Pennsylvania | Philadelphia | 572 | 142.68 | 4.009
Minnesota | Anoka | 1,748 | 446.28 | 3.917
California | Alameda | 3,156 | 821.26 | 3.843
Virginia | Lynchburg City | 191 | 50 | 3.838
Wisconsin | Washington | 1,648 | 435.92 | 3.781
Texas | Dallas | 3,429 | 908.87 | 3.773
Michigan | Wayne | 2,516 | 672.26 | 3.743
Oregon | Washington | 2,641 | 726.43 | 3.636
California | Santa Clara | 4,704 | 1304.53 | 3.606
Rhode Island | Bristol | 159 | 44.71 | 3.556
New Jersey | Essex | 460 | 129.59 | 3.550
New Jersey | Morris | 1,707 | 481.36 | 3.546
California | Sacramento | 3,512 | 995.65 | 3.527
Texas | Harris | 6,246 | 1777.89 | 3.513
Michigan | Oakland | 3,173 | 908.07 | 3.494
Ohio | Montgomery | 1,595 | 464.39 | 3.435
Virginia | Richmond City | 215 | 63 | 3.434
New Jersey | Camden | 778 | 227.59 | 3.418
Nebraska | Douglas | 1,159 | 339.65 | 3.412
California | Ventura | 7,514 | 2208.36 | 3.403
Rhode Island | Kent | 626 | 188.01 | 3.330
California | San Diego | 14,766 | 4525.92 | 3.263
North Carolina | Wake | 2,788 | 857.5 | 3.251
Texas | Collin | 2,872 | 885.91 | 3.242
Pennsylvania | Allegheny | 2,414 | 744.73 | 3.241
Ohio | Franklin | 1,750 | 543.35 | 3.221
New Mexico | Bernalillo | 3,661 | 1168.77 | 3.132
Colorado | Jefferson | 2,415 | 778.2 | 3.103
Virginia | Fairfax | 1,250 | 407 | 3.073
Delaware | New Castle | 1,513 | 493.53 | 3.066
Texas | Denton | 2,772 | 910.55 | 3.044
Iowa | Polk | 1,787 | 591.94 | 3.019
Pennsylvania | Montgomery | 1,423 | 487.47 | 2.919
Minnesota | Washington | 1,228 | 423.19 | 2.902
North Carolina | Guilford | 1,902 | 657.73 | 2.892
Pennsylvania | Bucks | 1,774 | 622.15 | 2.851
Georgia | Cobb | 974 | 344.54 | 2.827
Indiana | Vanderburgh | 666 | 235.76 | 2.825
California | San Francisco | 655 | 231.89 | 2.825
Tennessee | Sullivan | 1,204 | 429.71 | 2.802
District of Columbia | - | 191 | 68.36 | 2.794
California | San Mateo | 2,065 | 741.07 | 2.787
Oklahoma | Oklahoma | 1,995 | 718.38 | 2.777
Illinois | Kane | 1,431 | 524.12 | 2.730
California | Los Angeles | 12,846 | 4752.32 | 2.703
Georgia | Fayette | 538 | 199.27 | 2.700
Washington | Clark | 1,743 | 656.26 | 2.656
Ohio | Summit | 1,086 | 420.1 | 2.585
Colorado | Arapahoe | 2,080 | 805.47 | 2.582
New Hampshire | Rockingham | 2,011 | 794.04 | 2.533
New Jersey | Burlington | 2,062 | 819.48 | 2.516
Kentucky | Franklin | 531 | 212.13 | 2.503
Minnesota | Dakota | 1,463 | 586.36 | 2.495
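The density column appears to be simply the number of geocaches divided by the area. A small sketch that recomputes it for a few rows copied from the table; the `density` helper and the tolerance are assumptions made for illustration. Rows whose area is shown as a whole number (the Virginia independent cities, for instance) do not reproduce exactly, which suggests the listed density was computed from an unrounded area:

```python
# (state, county, caches, area in sq. miles, listed density), copied from the table above
rows = [
    ("Nevada", "Clark", 6872, 737.67, 9.316),
    ("New York", "New York", 290, 33.68, 8.610),
    ("Minnesota", "Ramsey", 1399, 170.16, 8.222),
    ("Virginia", "Fredericksburg City", 158, 11, 15.019),
]

def density(caches, area_sq_mi):
    """Geocaches per square mile."""
    return caches / area_sq_mi

for state, county, caches, area, listed in rows:
    computed = density(caches, area)
    # Flag rows where the recomputed value disagrees with the listed one.
    note = "" if abs(computed - listed) < 0.001 else "  <- differs from listed value"
    print(f"{county}, {state}: {computed:.3f} (listed {listed:.3f}){note}")
```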

2.4k · June 4, 2014

I think that there are too many factors to come up with a valid conclusion. For example, the buildings could have caches placed on their roofs (public viewing areas) or inside somewhere (the final of a multi, such as library caches). There could be nanos on parking meters. So even in cities where there isn't much green space, the density is really only determined by permission from the land manager.

3.3k · June 4, 2014

I think that there are too many factors to come up with a valid conclusion. For example, the buildings could have caches placed on their roofs (public viewing areas) or inside somewhere (the final of a multi, such as library caches). There could be nanos on parking meters. So even in cities where there isn't much green space, the density is really only determined by permission from the land manager.

Yep, in urban areas you'd be amazed how many caches there are on the streets.

OP - take a look at London (England) to see how densely they can be packed in, in urban areas. If you've got road signs you've got places to hide film pots and nanos. If you've got railings you've got places to hide nanos. If you've got even a small park you've got space to hide a couple of micros.

Whether the caches hidden are worth finding is a matter of opinion, but you can certainly pack a load of them in there if you're not too worried about how interesting they are.

Edited June 4, 2014 by team tisri

June 4, 2014

Some quick thoughts on this: I'm assuming you're doing this as a computing exercise.

Ignoring the excluded-area issue for a minute, you should be able to get a reasonable idea through a simple simulation. Treat your region as a 10 mile by 10 mile square, say, then let your "geocachers" pick a random spot in the square and place a cache there if proximity allows, until you run out of space. You need to allow for edge effects - I'd do this by wrapping around (i.e., a cache at (9.999,2.3) would be treated as within 0.1 mile of one at (0.001, 2.31)).

You could try a more complicated simulation which takes into account that cachers like to place series at just over the 0.1 limit, so favour placements just outside existing proximity zones.

For an urban area, I would expect that the presence of buildings and streets wouldn't affect things much. This assumes you can place a cache anywhere along a street you like (on a light pole, for instance). Streets are sufficiently dense that wherever you choose to put a cache, you can probably move it less than 100ft onto the nearest street and not affect proximity in any material way.

The same applies for wilderness, where you can again place a cache more or less anywhere. The interesting case is cultivated rural areas, where most caches are going to be placed along roads and footpaths, but there aren't too many of either, so this can restrict the number of cache placements considerably. Setting up a model for this doesn't feel too hard to me.
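The simple simulation described in this post could be sketched roughly as follows. This is only one way to do it: the function name, the "give up after N consecutive failed attempts" stopping rule, and the grid used to speed up the proximity check are all assumptions, and the small demo region is chosen just to keep the run short.

```python
import random

def simulate_random_placement(side=10.0, min_sep=0.1, max_failures=5000, seed=0):
    """Place caches at random on a wrapped square until placements keep failing.

    Returns the list of accepted (x, y) positions, in miles.
    """
    rng = random.Random(seed)
    n_cells = max(1, int(side / min_sep))    # each grid cell is at least min_sep wide
    cell = side / n_cells
    grid = {}                                # (col, row) -> positions in that cell
    caches = []
    failures = 0
    while failures < max_failures:           # assumed stopping rule, not exact saturation
        x, y = rng.random() * side, rng.random() * side
        col = min(int(x / cell), n_cells - 1)
        row = min(int(y / cell), n_cells - 1)
        allowed = True
        # Only the 3 x 3 block of neighbouring cells can hold a conflicting cache.
        for dc in (-1, 0, 1):
            for dr in (-1, 0, 1):
                key = ((col + dc) % n_cells, (row + dr) % n_cells)
                for px, py in grid.get(key, ()):
                    dx = abs(x - px)
                    dy = abs(y - py)
                    dx = min(dx, side - dx)  # wrap around the edges
                    dy = min(dy, side - dy)
                    if dx * dx + dy * dy < min_sep * min_sep:
                        allowed = False
        if allowed:
            caches.append((x, y))
            grid.setdefault((col, row), []).append((x, y))
            failures = 0
        else:
            failures += 1
    return caches

placed = simulate_random_placement(side=3.0, max_failures=3000)
print(f"{len(placed)} caches, about {len(placed) / 9.0:.1f} per sq. mile")
```

Because the loop stops after a run of failures rather than proving that no spot is left, the result slightly underestimates the true random-placement limit; raising `max_failures` tightens it at the cost of run time.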

3.1k · June 4, 2014

What's the minimum area you allow for? If it's a square inch, then I know of a hidden nano cache that has a density of 4,014,489,600 caches per square mile.

June 4, 2014

There are too many unknowns to give an accurate answer. Yes, a certain portion of land will be given over to buildings and other places you can't put a cache, but how is that parcelled up and distributed amongst the land? Is it in a large contiguous block, or in small parcels with gaps in between (much like blocks and roads in the grid style)? No two urban areas are the same and of course outside urban areas it's another story entirely.

I did put together a puzzle cache that I never submitted along these lines based on the theoretical maximum, ignoring what we'll call "real-world" considerations, and it's along the lines of the hexagonal packing method alluded to above. You can get over 118 caches per square mile that way, but that's based on being able to hide a cache absolutely everywhere, and having a clean slate to start with and hiding every single cache in the optimum arrangement.

That said, I do like crb11's algorithm for a more real-world solution, as that involves each cache being hidden in and around the existing caches.
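For reference, the large-area limit of the hexagonal arrangement can be worked out directly. A minimal sketch, assuming an unbounded plane and the 0.1 mile separation; the function name is made up for illustration. Figures like the 118 mentioned above presumably come from counting caches inside one finite square mile, where caches sitting on or near the edges add to the total.

```python
import math

def hex_packing_density(min_sep=0.1):
    """Caches per unit area for an unbounded hexagonal lattice with spacing min_sep."""
    # Each lattice point "owns" a rhombus of area (sqrt(3) / 2) * min_sep ** 2.
    return 2.0 / (math.sqrt(3.0) * min_sep ** 2)

print(round(hex_packing_density(0.1), 1))  # about 115.5 per square mile
```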

June 4, 2014

Some quick thoughts on this: I'm assuming you're doing this as a computing exercise.

Ignoring the excluded-area issue for a minute, you should be able to get a reasonable idea through a simple simulation. Treat your region as a 10 mile by 10 mile square, say, then let your "geocachers" pick a random spot in the square and place a cache there if proximity allows, until you run out of space. You need to allow for edge effects - I'd do this by wrapping around (i.e., a cache at (9.999,2.3) would be treated as within 0.1 mile of one at (0.001, 2.31)).

You could try a more complicated simulation which takes into account that cachers like to place series at just over the 0.1 limit, so favour placements just outside existing proximity zones.

For an urban area, I would expect that the presence of buildings and streets wouldn't affect things much. This assumes you can place a cache anywhere along a street you like (on a light pole, for instance). Streets are sufficiently dense that wherever you choose to put a cache, you can probably move it less than 100ft onto the nearest street and not affect proximity in any material way.

The same applies for wilderness, where you can again place a cache more or less anywhere. The interesting case is cultivated rural areas, where most caches are going to be placed along roads and footpaths, but there aren't too many of either, so this can restrict the number of cache placements considerably. Setting up a model for this doesn't feel too hard to me.

I absolutely love this idea!!! Just the kind of thing I can do for this project! :-)

June 4, 2014

What's the minimum area you allow for? If it's a square inch, then I know of a hidden nano cache that has a density of 4,014,489,600 caches per square mile.

Yeah, I was thinking about this when I was looking at some of my counties with the highest density of geocaches. They are always the smallest counties in the list.

June 4, 2014

You could exclude airports, train stations, and schools, as well as any other area that violates the guidelines.

7.8k · June 4, 2014

What's the minimum area you allow for? If it's a square inch, then I know of a hidden nano cache that has a density of 4,014,489,600 caches per square mile.

That's how many physical containers you can fit, but I think the OP is wondering about Groundspeak 528'-separation caches. According to this discussion, apparently you can pack ~120 caches in a square mile assuming no other restrictions.

June 4, 2014

My interest in Geocache Density was sparked when I was reading some discussion posts about Seattle being completely saturated. But then when I was looking at my data, it was nowhere close to a mathematical maximum density of 118 or so per sq. mile. Nothing has been close to that...

So what makes geocachers call an area saturated? (Other than the few that say an area is saturated only because they are upset they cannot hide the cache they wanted to hide. lol) But the conversation about Seattle seemed to imply that it truly had no more room for geocaches. Is that the case? How is that determined?

I am trying to relate Geocache Density to Population Density in my project as a way to create a mathematical model for predicting the future geocache density of an area based on the future population predictions from the census. My data has shown a correlation coefficient between the two variables of r^2 = 0.6382 which is pretty good for this kind of data I think.

So that was why I was trying to figure out a certain number for the true maximum density so that I could say something like "Since this CITY will reach a population density of X, our model shows it will potentially have a geocache density of Y. Since Y is greater than some MAXIMUM we can conclude that this CITY has the potential of being completely saturated with geocaches in the next # years."

Is this something that I could potentially get to work or am I just going to be going in circles with data that has too many variables to make any true conclusions/predictions?

7.8k · June 4, 2014

...am I just going to be going in circles with data that has too many variables to make any true conclusions/predictions?

I think this is where you're at. Even within a city, the variables will vary widely from block to block. Then you have variations from city to city, state to state, country to country, etc. They're called variables for a good reason!

As for calling Seattle saturated, I think they're using a less literal definition. Rather than being mathematically saturated, where it's literally impossible to fit another cache due to the proximity guideline, I think they're referring to effective saturation. That's where all the "good" spots have been covered. You may be able to fit more caches, but they'd likely be lame (film canister behind a sign on the side of the road, LPC, etc.), illegal (cache on the side of an interstate, private property, etc.), or just ill-advised (hidden near a sensitive location).

June 4, 2014

...am I just going to be going in circles with data that has too many variables to make any true conclusions/predictions?

I think this is where you're at. Even within a city, the variables will vary widely from block to block. Then you have variations from city to city, state to state, country to country, etc. They're called variables for a good reason!

As for calling Seattle saturated, I think they're using a less literal definition. Rather than being mathematically saturated, where it's literally impossible to fit another cache due to the proximity guideline, I think they're referring to effective saturation. That's where all the "good" spots have been covered. You may be able to fit more caches, but they'd likely be lame (film canister behind a sign on the side of the road, LPC, etc.), illegal (cache on the side of an interstate, private property, etc.), or just ill-advised (hidden near a sensitive location).

Sigh... OK, that's what I was starting to think. Just not very helpful for a mathematical approach. Oh well, I'm going to keep working on this if anyone has any suggestions. The title of the project is simply A Mathematical Analysis of Geocaching, so there are several directions I could take this...

I have data for over 3,000 counties in the USA including land area (sq. miles), population, geocaches, population density, geocache density...

5.3k · June 4, 2014

I am trying to relate Geocache Density to Population Density in my project as a way to create a mathematical model for predicting the future geocache density of an area based on the future population predictions from the census. My data has shown a correlation coefficient between the two variables of r^2 = 0.6382 which is pretty good for this kind of data I think.

Actually, that is NOT good. I would take it as an indication that the cache density is at most only indirectly related to the population density. Using r^2 or significance values can be very tricky.

Here is a hint that may help you find a better model: the population density in the LA area is roughly constant but in some areas there is a high cache density and in others (e.g. Compton) it is very low.

I would hypothesize that cache density is related to 3 primary factors: population density, park area, and average income.

14k · June 4, 2014

I am trying to relate Geocache Density to Population Density in my project as a way to create a mathematical model for predicting the future geocache density of an area based on the future population predictions from the census. My data has shown a correlation coefficient between the two variables of r^2 = 0.6382 which is pretty good for this kind of data I think.

Actually, that is NOT good. I would take it as an indication that the cache density is at most only indirectly related to the population density.

Take a look outside the U.S. and Europe and you'll find that population density doesn't seem to be related at all. Bangladesh is considered the most densely populated country in the world. There are 3 caches in the entire country.

June 5, 2014

I am trying to relate Geocache Density to Population Density in my project as a way to create a mathematical model for predicting the future geocache density of an area based on the future population predictions from the census. My data has shown a correlation coefficient between the two variables of r^2 = 0.6382 which is pretty good for this kind of data I think.

Actually, that is NOT good. I would take it as an indication that the cache density is at most only indirectly related to the population density.

Take a look outside the U.S. and Europe and you'll find that population density doesn't seem to be related at all. Bangladesh is considered the most densely populated country in the world. There are 3 caches in the entire country.

I am not trying to create a model for all geocaches in the world. Only in the US. There are too many differences in the various countries to do EVERYTHING.

June 5, 2014

I am trying to relate Geocache Density to Population Density in my project as a way to create a mathematical model for predicting the future geocache density of an area based on the future population predictions from the census. My data has shown a correlation coefficient between the two variables of r^2 = 0.6382 which is pretty good for this kind of data I think.

Actually, that is NOT good. I would take it as an indication that the cache density is at most only indirectly related to the population density. Using r^2 or significance values can be very tricky.

Here is a hint that may help you find a better model: the population density in the LA area is roughly constant but in some areas there is a high cache density and in others (e.g. Compton) it is very low.

I would hypothesize that cache density is related to 3 primary factors: population density, park area, and average income.

When I break it down by state, there are many that have an r^2 as high as 0.999. I am not saying this is going to be a perfect model, but it will work for what I am trying to accomplish. And I am certainly not claiming causation by any means, just that there is a relationship. Obviously there are more variables involved than just population density. I have an entire section of my report about other things to include in future research. But there is sooooo much data and soooo many variables, right now I am just focusing on one or two areas.

June 5, 2014

I am trying to relate Geocache Density to Population Density in my project as a way to create a mathematical model for predicting the future geocache density of an area based on the future population predictions from the census. My data has shown a correlation coefficient between the two variables of r^2 = 0.6382 which is pretty good for this kind of data I think.

Actually, that is NOT good. I would take it as an indication that the cache density is at most only indirectly related to the population density. Using r^2 or significance values can be very tricky.

Here is a hint that may help you find a better model: the population density in the LA area is roughly constant but in some areas there is a high cache density and in others (e.g. Compton) it is very low.

I would hypothesize that cache density is related to 3 primary factors: population density, park area, and average income.

When I break it down by state, there are many that have an r^2 as high as 0.999. I am not saying this is going to be a perfect model, but it will work for what I am trying to accomplish. And I am certainly not claiming causation by any means, just that there is a relationship. Obviously there are more variables involved than just population density. I have an entire section of my report about other things to include in future research. But there is sooooo much data and soooo many variables, right now I am just focusing on one or two areas.

If my r^2 = 0.6382 then my r = 0.7989

http://www.drtomoconnor.com/3760/3760lect07.htm

CORRELATION

The most commonly used relational statistic is correlation, and it's a measure of the strength of some relationship between two variables, not causality. Interpretation of a correlation coefficient does not even allow the slightest hint of causality. The most a researcher can say is that the variables share something in common; that is, are related in some way. The more two things have something in common, the more strongly they are related. There can also be negative relations, but the important quality of correlation coefficients is not their sign, but their absolute value. A correlation of -.58 is stronger than a correlation of .43, even though with the former, the relationship is negative. The following table lists the interpretations for various correlation coefficients:

.8 to 1.0 very strong

.6 to .8 strong

.4 to .6 moderate

.2 to .4 weak

.0 to .2 very weak

Edited June 5, 2014 by Rosie_Posie
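The arithmetic in this post can be checked in a couple of lines. A minimal sketch: the bands are copied from the quoted table, and assigning a boundary value such as .6 to the higher band is an assumption, since the quoted ranges overlap at their endpoints. Note that taking the square root only recovers the magnitude of r, not its sign.

```python
import math

# (lower bound, label) pairs copied from the interpretation table quoted above.
BANDS = [
    (0.8, "very strong"),
    (0.6, "strong"),
    (0.4, "moderate"),
    (0.2, "weak"),
    (0.0, "very weak"),
]

def strength(r):
    """Label a correlation coefficient by its absolute value."""
    magnitude = abs(r)
    for lower, label in BANDS:
        if magnitude >= lower:   # assumed: boundary values go to the higher band
            return label

r = math.sqrt(0.6382)            # magnitude of r implied by r^2 = 0.6382
print(round(r, 4), strength(r))
```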

June 5, 2014

I am trying to relate Geocache Density to Population Density in my project as a way to create a mathematical model for predicting the future geocache density of an area based on the future population predictions from the census. My data has shown a correlation coefficient between the two variables of r^2 = 0.6382 which is pretty good for this kind of data I think.

Actually, that is NOT good. I would take it as an indication that the cache density is at most only indirectly related to the population density.

Take a look outside the U.S. and Europe and you'll find that population density doesn't seem to be related at all. Bangladesh is considered the most densely populated country in the world. There are 3 caches in the entire country.

I am not trying to create a model for all geocaches in the world. Only in the US. There are too many differences in the various countries to do EVERYTHING.

Also, I meant to mention earlier, there is a maximum of sorts on the population density that this equation will work for. When you think of large cities in the US with high population densities, it is because they are literally stacking people on top of each other in large apartment buildings and skyscrapers. You cannot do that with geocaches obviously. So in my paper I discuss the fact that this equation only works in locations with a population density less than 3000 or so. After that it does not work as well.

Again, this is certainly not a perfect model. I am simply trying to give some mathematical analysis to something that is quite random at first glance...

June 5, 2014

Something else to try, particularly if you think certain areas are saturated, is to try and make models for the number of caches in an area over time. I'm not sure the historic data is available, but there may be ways round that.

A basic model would be: number of caches at time T is population density * exponential factor * some "culture" factor. Exponential factor is based on the growth of caches worldwide over time, which I think has been roughly exponential. Culture factor relates to things like income, perhaps climate, plus just whether there's been a more or less active caching community locally.

However, if you're nearing saturation, you would expect the number of caches to tail off and approach some limit asymptotically. This cache density should, hopefully, be a reasonable match for the number you got from your modelling, although probably rather lower (certain parts of the region may be off-limits for caching for whatever reason).

The data you get out will be very noisy, for all sorts of reasons, but I'd hope you can get something reasonable out of it.

The complication is that caches get archived, and I don't think it's easy to get data on what caches have previously existed in the region. But you do know what is there now, and publication dates for each cache. Probably a model which just assumes that 20% of the caches in a region will get archived each year, or some appropriate figure, will work for you. There's a further big project to be done to analyse expected cache lifespans, but I'll leave that for now.
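One way to play with the idea in this post is a yearly update where new hides slow down as the region fills and a fixed share of existing caches is archived. This is only a sketch under stated assumptions: the logistic form is my own stand-in for "tail off and approach some limit", and every parameter value below (capacity, growth rate, starting count) is made up purely for illustration; only the 20% archive rate comes from the post.

```python
def project_cache_counts(start, capacity, growth_rate, archive_rate=0.20, years=30):
    """Yearly cache counts under logistic growth with a constant archive rate."""
    counts = [float(start)]
    for _ in range(years):
        n = counts[-1]
        new_hides = growth_rate * n * (1.0 - n / capacity)  # slows as space runs out
        archived = archive_rate * n                         # e.g. 20% archived per year
        counts.append(max(0.0, n + new_hides - archived))
    return counts

# Hypothetical region: room for 1,000 caches, 10 caches in year 0.
history = project_cache_counts(start=10, capacity=1000, growth_rate=0.6, years=60)
for year in (0, 5, 10, 20, 60):
    print(year, round(history[year]))
```

With these update rules the count levels off at capacity * (1 - archive_rate / growth_rate) rather than at the full capacity, which fits the remark that the observed plateau would probably be rather lower than the modelled maximum.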

June 5, 2014

I am not trying to create a model for all geocaches in the world. Only in the US. There are too many differences in the various countries to do EVERYTHING.

I dunno, students these days, lack of ambition... :-)

The idea is interesting. I think you're going to have to be really clear about your assumptions (think Christaller's assumptions for example) but the correlation figure you have there is interesting certainly. Whether or not there's causation with that I don't know, but it's still an interesting correlation.

I think you also have to consider that really none of the classic spatial models (Weber, Christaller etc...) are ever that reliable once you try to implement them. That's OK - "it's only a model". It doesn't stop the model itself being interesting.

Once you start putting things like urban parks on top of your assumed urban area - and then factor in that some urban parks will ban geocaching (royal parks in London for example, but there are other authorities in the UK which do too) - I would imagine that changes stuff. That's OK - reality is always much more interesting than a model! Obviously with caches you're going to run into major issues if you have multis (especially those with physical stages) and unknown caches at different locations to factor in.

Fwiw, and you may already know of this, there was a study done (I think in the late 70s) of "coffee shops" in Amsterdam looking to see if they were optimally spaced (no pun intended). I seem to recall the spacing was fairly close to that predicted by the model. It's a long, long time since I had to teach that stuff, but there may have been other examples where this sort of thing is seen.

Oh, I don't know if you've seen this tool btw: http://project-gc.com/Maps/mapcoverage/?country=United+States&region=New+York&county=New+York+County+%28NY%29&submit=Filter (you may need to link your account to see it). Zoom in... The same site will allow you to see locations of archived and disabled caches btw.

Good luck.

10.1k · June 5, 2014

A few thoughts (or monkey wrenches?) You think New York County has 290 caches. My Geocaching Op Art has several caches listed as being in New York County, but which are actually in Hudson County. That will knock New York County down one place.

As to population density, Hudson County ranks #6, but does not make your list.

As to population density, why use counties as opposed to municipalities? You will find that the four most densely populated municipalities are in Hudson County, NJ! Guttenberg, Union City, West New York and Hoboken. Yet Guttenberg and Union City have only two caches each! (Of course, Guttenberg is only .19 square mile, so it does not have a lot of room to hide caches... But, I'll see what I can do...)

So: your statistics are interesting, but do they mean anything? And, I would have to ask the reason for three decimal points on cache density? Precision versus accuracy? Is there any meaning to it?

3.3k · June 5, 2014

...am I just going to be going in circles with data that has too many variables to make any true conclusions/predictions?

I think this is where you're at. Even within a city, the variables will vary widely from block to block. Then you have variations from city to city, state to state, country to country, etc. They're called variables for a good reason!

As for calling Seattle saturated, I think they're using a less literal definition. Rather than being mathematically saturated, where it's literally impossible to fit another cache due to the proximity guideline, I think they're referring to effective saturation. That's where all the "good" spots have been covered. You may be able to fit more caches, but they'd likely be lame (film canister behind a sign on the side of the road, LPC, etc.), illegal (cache on the side of an interstate, private property, etc.), or just ill-advised (hidden near a sensitive location).

Sigh... OK, that's what I was starting to think. Just not very helpful for a mathematical approach. Oh well, I'm going to keep working on this if anyone has any suggestions. The title of the project is simply A Mathematical Analysis of Geocaching, so there are several directions I could take this...

I have data for over 3,000 counties in the USA including land area (sq. miles), population, geocaches, population density, geocache density...

Pure maths alone won't solve it, but for the sake of mathematical consideration let's assume that there are no restrictions on where a cache may be placed and that every cache placement is sufficiently interesting that someone might place it.

As people have said if you take a theoretical perfect saturation model you'd have caches placed in a hexagonal arrangement with each cache exactly 528 feet from its nearest neighbours. But all it takes is one cache placed other than in the theoretically perfect position and the maximum saturation model breaks. You could theoretically work out the least efficient placement option for a cache, given that if it's 0.2 miles from another cache it's possible to put another one in the middle, if it's 0.199 miles from another cache it blocks out the middle but another cache could be placed slightly off the line between them (which would only marginally disrupt the perfect hexagon arrangement) and so on.

When the real world intrudes you start to get complications that make it impractical or impossible to place caches in some areas (e.g. the middle of a river), illegal to place in other areas (railways, the middle of a major road, an airport etc), and uninteresting to place caches in many other areas (even if light poles in parking areas are 0.1 mile apart I wouldn't want to find Wally World surrounded with half a dozen bison tubes under the skirts of the lighting). So if you wanted to consider an area as large as a state or a county you'd have to do some pretty serious analysis of just what land was available to actually place caches to work out the actual maximum density that was possible for the given area.

Alternatively you could look at individual counties, look at the cache density and compare them to the theoretical maximum assuming no limitations at all, and see how different areas compared to the theoretical maximum. I'd guess urban areas would come closest, simply because in an urban area there are roads everywhere, signs everywhere, and more cachers, so more likelihood that you'll get a cache behind a sign just because the sign is available as a hiding spot. If you've got a few square miles of forest the chances are people aren't going to hide dozens of caches at precise intervals just because they can - even if you get a powertrail it's going to be based on following a line rather than covering an area.

3.3k · June 5, 2014

A few more thoughts.

If you look at the theoretical perfect hexagonal pattern, knocking out one spot as being unsuitable doesn't necessarily change the surrounding six spots, so the presence of an obstacle potentially only reduces the number of possible caches by 1.

If you shift the pattern by up to 0.05 miles in any direction you can potentially overcome impossible spots. So if you've got a road running through the area (let's assume the road is perfectly straight), if it runs along a line of possible spots it knocks them all out but if you shift the pattern slightly you may be able to hide caches every 0.1 mile either side of the road, in which case it doesn't eliminate any spots.
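The effect of shifting the pattern can be seen with a toy example. A minimal sketch, assuming a one-mile square, a perfectly straight east-west road lying exactly along one row of the lattice, and a narrow no-hide strip around it; the road position, strip width, and function names are all hypothetical.

```python
import math

SPACING = 0.1                              # minimum separation in miles
ROW_GAP = SPACING * math.sqrt(3.0) / 2.0   # distance between lattice rows

def hex_lattice(width, height, shift_y=0.0):
    """Hexagonal lattice points inside a width x height rectangle (miles)."""
    points = []
    for j in range(-2, int(height / ROW_GAP) + 3):
        y = j * ROW_GAP + shift_y
        offset = SPACING / 2.0 if j % 2 else 0.0   # alternate rows are staggered
        for i in range(-2, int(width / SPACING) + 3):
            x = i * SPACING + offset
            if 0.0 <= x < width and 0.0 <= y < height:
                points.append((x, y))
    return points

def clear_of_road(points, road_y, half_width):
    """Keep only the spots farther than half_width from the road's centre line."""
    return [p for p in points if abs(p[1] - road_y) > half_width]

road_y = 5 * ROW_GAP   # hypothetical road lying exactly along one lattice row
half_width = 0.005     # hypothetical no-hide strip, about 26 ft each side

for shift in (0.0, 0.02):
    spots = hex_lattice(1.0, 1.0, shift_y=shift)
    usable = clear_of_road(spots, road_y, half_width)
    print(f"shift {shift:.2f} mi: {len(usable)} of {len(spots)} spots usable")
```

Unshifted, the road wipes out a whole row of spots; nudging the pattern a short distance north moves that row off the road and recovers them, which is the point made above.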

June 9, 2014

A few thoughts (or monkey wrenches?) You think New York County has 290 caches. My Geocaching Op Art has several caches listed as being in New York County, but which are actually in Hudson County. That will knock New York County down one place.

As to population density, Hudson County ranks #6, but does not make your list.

As to population density, why use counties as opposed to municipalities? You will find that the four most densely populated municipalities are in Hudson County, NJ: Guttenberg, Union City, West New York and Hoboken. Yet Guttenberg and Union City have only two caches each! (Of course, Guttenberg is only 0.19 square miles, so it does not have a lot of room to hide caches... But, I'll see what I can do...)

So: your statistics are interesting, but do they mean anything? And I would have to ask the reason for three decimal places on cache density. Precision versus accuracy? Is there any meaning to it?

I used counties for two reasons: 1) I found lots of awesome data on the number of geocaches in each county in the US, so it was easy to access everything, and 2) counties have nice distinct boundaries that I feel are easier to define.

And I used 3 decimal places because when you get into the less populated areas, say Jamestown, North Dakota, the geocache density is only 0.079. My project is not ONLY looking at high-saturation areas. It is looking at over 3,000 counties.

June 9, 2014

A few more thoughts.

If you look at the theoretical perfect hexagonal pattern, knocking out one spot as being unsuitable doesn't necessarily change the surrounding six spots, so the presence of an obstacle potentially only reduces the number of possible caches by 1.

If you shift the pattern by up to 0.05 miles in any direction you can potentially overcome impossible spots. So if you've got a road running through the area (let's assume the road is perfectly straight), if it runs along a line of possible spots it knocks them all out but if you shift the pattern slightly you may be able to hide caches every 0.1 mile either side of the road, in which case it doesn't eliminate any spots.

This makes a lot of sense. So this would support the idea that roads, buildings, etc. will probably not make that much difference in terms of a maximum density of an area...

I think what I am going to focus on is the randomness of how people hide them and do a simulation like someone posted earlier. Focus on the human factor more so than the physical obstacle factors.
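One possible way to set up that kind of simulation, sketched in Python (this is not necessarily what was posted earlier in the thread). It assumes hiders pick spots uniformly at random in a 1 mile by 1 mile square with no obstacles, and that a hide is accepted only if it is at least 0.1 miles from every cache already placed. The square size, attempt count and seed are arbitrary choices.

```python
import math
import random

MIN_SEP = 0.1  # minimum spacing in miles


def simulate_random_hides(side=1.0, attempts=20000, min_sep=MIN_SEP, seed=0):
    """Place caches one at a time at random spots, rejecting any that are too close."""
    rng = random.Random(seed)
    placed = []
    for _ in range(attempts):
        x, y = rng.uniform(0, side), rng.uniform(0, side)
        # accept the spot only if it respects the spacing rule against every cache so far
        if all(math.hypot(x - px, y - py) >= min_sep for px, py in placed):
            placed.append((x, y))
    return placed


caches = simulate_random_hides()
hex_max = 2 / (math.sqrt(3) * MIN_SEP ** 2)
print(f"random placement: {len(caches)} caches in 1 sq. mile")
print(f"hexagonal maximum: {hex_max:.1f} per sq. mile")
```

Once the square fills up, nearly every new attempt gets rejected, so the count stalls well short of the hexagonal figure. Bear in mind that a square this small has noticeable edge effects, so a larger area would give a steadier number.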

June 9, 2014

Something else to try, particularly if you think certain areas are saturated, is to try and make models for the number of caches in an area over time. I'm not sure the historic data is available, but there may be ways round that.

A basic model would be: number of caches at time T is population density * exponential factor * some "culture" factor. Exponential factor is based on the growth of caches worldwide over time, which I think has been roughly exponential. Culture factor relates to things like income, perhaps climate, plus just whether there's been a more or less active caching community locally.

However, if you're nearing saturation, you would expect the number of caches to tail off and approach some limit asymptotically. The cache density at that limit should, hopefully, be a reasonable match for the number you got from your modelling, although probably rather lower (certain parts of the region may be off-limits for caching for whatever reason).
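A minimal sketch of what "exponential at first, then tailing off to a limit" could look like, in Python. A logistic curve is one common choice for this shape, but it is an assumption here, not something established in this thread. The starting count, growth rate and limit below are made-up values purely for illustration; in the model described above, the starting count would stand in for population density times the culture factor.

```python
import math


def unconstrained_caches(t, n0, r):
    """Pure exponential growth: n0 caches at t = 0, growth rate r per year."""
    return n0 * math.exp(r * t)


def logistic_caches(t, n0, r, k):
    """Logistic growth: close to exponential early on, levelling off at the limit k."""
    return k / (1.0 + ((k - n0) / n0) * math.exp(-r * t))


# Hypothetical parameters, for illustration only
N0, RATE, LIMIT = 5, 0.5, 300

for year in range(0, 21, 4):
    print(f"year {year:2d}: "
          f"exponential {unconstrained_caches(year, N0, RATE):10.0f}, "
          f"logistic {logistic_caches(year, N0, RATE, LIMIT):6.1f}")
```

Fitting the limit k to a county's cache counts over time would give an empirical saturation figure to set against the packing-based estimates.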

The data you get out will be very noisy, for all sorts of reasons, but I'd hope you can get something reasonable out of it.

The complication is that caches get archived, and I don't think it's easy to get data on what caches have previously existed in the region. But you do know what is there now, and publication dates for each cache. Probably a model which just assumes that 20% of the caches in a region will get archived each year, or some appropriate figure, will work for you. There's a further big project to be done to analyse expected cache lifespans, but I'll leave that for now.
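Here is a rough Python sketch of that constant-archive-rate idea. It takes the caches that exist today, grouped by publication year, and back-calculates how many caches were probably active in each earlier year. The 20% rate is the figure suggested above; the sample counts are hypothetical, chosen so the arithmetic is easy to check.

```python
ARCHIVE_RATE = 0.20  # assumed fraction of active caches archived each year


def estimate_active_history(surviving_by_year, current_year, archive_rate=ARCHIVE_RATE):
    """Estimate how many caches were active in each past year.

    surviving_by_year maps publication year -> caches from that year still active now.
    Under a constant archive rate, the caches active in year Y are today's survivors
    published in or before Y, scaled back up by the survival factor since Y.
    """
    keep = 1.0 - archive_rate
    history = {}
    for year in range(min(surviving_by_year), current_year + 1):
        survivors = sum(n for pub, n in surviving_by_year.items() if pub <= year)
        history[year] = survivors / keep ** (current_year - year)
    return history


# Hypothetical data: consistent with 100 caches published per year, 20% archived yearly
surviving = {2012: 64, 2013: 80, 2014: 100}

for year, active in estimate_active_history(surviving, 2014).items():
    print(f"{year}: about {active:.0f} active caches")
# 2012: about 100 active caches
# 2013: about 180 active caches
# 2014: about 244 active caches
```

A single flat rate is a crude assumption, since real cache lifespans probably vary a lot, but it gives a time series to fit a growth model against when archived-cache data is not available.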

If I can find a way to get this kind of data, I was definitely thinking of going this direction with part of my project. My problem is tracking down the data....... I can do it, I just need more hours in the day to be able to work on this!!! ahhhh! :-P

June 9, 2014

Just to throw another cog in the works, look at where the roads run.

If you've got roads that run perfectly east-west and that are exact multiples of 0.1 miles apart, you can shift your theoretical perfect grid north or south by 0.05 miles and it still works, with caches shifted off to the sides of the roads. Shift the roads so they don't run so neatly and one road could potentially knock out a line of theoretical hiding spots. You still wouldn't necessarily lose all those spots, but you would need to shift part of the grid off the road, meaning you'd end up with the same pattern duplicated with a dead stripe down the middle.
