1

I have a data set of 3229 movies, and I need to create random sample of 200 movies released before 2010, and also between 2010 and 2019.

I started with the trying to do the ones for 2009 and below 2010. I managed to make it work for one cell but when I expand it down to 200 rows, the cells whose corresponding year is > 2010 instantly returns FALSE, not sure workaround for this since I'm already indexing through.

Right now I'm using this Formula:

=IF(C2:C3230 < 2010, INDEX(F2:F3230,RANDBETWEEN(1,3229)))

Where C is the column for the years and F corresponds to the titles I want to print into my new sample.

Any help is much appreciated.

Please note, since I Couldn't find an upload button otherwise I would've upload the file also.

Rajesh Sinha
  • 8,995
  • 6
  • 15
  • 35
  • 1
    Better share some sample data will help us to justify the method we would like to suggest !! – Rajesh Sinha Apr 19 '19 at 05:35
  • 2
    Welcome to Super User. The main benefits of sample data are to understand what the data looks like, what your formula is pointing to, and to provide an easy way for people here to create, test, and demonstrate solutions. People are hesitant to download unknown files, so sharing the actual file isn't a great idea. Ideally, create a mock-up (table of text) with just enough sample data to support people trying to help. Short of that, post a screenshot so people can at least see what the data looks like and what's in the referenced columns. (cont'd) – fixer1234 Apr 19 '19 at 05:57
  • 1
    BTW, be aware that 1. RANDBETWEEN recalculates every time anything changes on the worksheet. 2. Your IF expression doesn't contain anything for "else". You provide the INDEX expression for when the IF test is true, but nothing for when it is false. That's likely why you're seeing "false" as the result for years >2010. – fixer1234 Apr 19 '19 at 05:58
  • 1
    Just to share my preliminary thought, create a Pivot Table and extract all rows before 2010. If those count to say 700 then using RAND & RANK create a random but unique list of 700 numbers from 1 thru 700. Search the Internet how to do so, in case you are not familiar. Then use INDEX with first 200 rows random data between 1 thru 700 on the Pivot extracted list. – patkim Apr 19 '19 at 15:17
  • @fixer1234 what would i need to put in the else bit to make it redo the search – Mitul Suchak Apr 19 '19 at 21:36
  • What do you want to happen for records >2010? For example, if you want the result cell to just appear blank, you could use `...INDEX(...),"")` (i.e., append `,""` after the INDEX expression). If you want to put some different value there for those cases, use an expression for the result you want, just like you did with the INDEX expression for the TRUE condition. – fixer1234 Apr 19 '19 at 21:49
  • Question1 : what is the content of `C2:C3230` cells? is it year, or something else? || Question2 : the formula you mentioned is entered as an array formula right? – p._phidot_ Apr 23 '19 at 10:09

0 Answers0