Sci2 Manual : 5.3 Global Level Studies - Macro

5.3.1 Geo USPTO (SDB Data)



Time frame:




Topical Area(s):


Analysis Type(s):

Geospatial Analysis

The file 'usptoInfluenza.csv' was generated with an SDB search for patents containing the term "Influenza" and was heavily modified to produce a simple geographic table. Load it using 'File > Load' and following this path: 'yoursci2directory/sampledata/geo/usptoInfluenza.csv' (if the file is not in the sample data directory, it can be downloaded from 2.5 Sample Datasets). Then select "Standard csv format". See the data format in Figure 5.37 (left). Once loaded, select the dataset in the Data Manager and run 'Visualization > Geospatial > Proportional Symbol Map' using the parameters in Figure 5.37 (right). The tool will output a PostScript visualization, which can be viewed using GhostView (see section 2.4 Saving Visualizations for Publication and Figure 5.38).

Figure 5.37: Geospatial workflow with usptoInfluenza.csv data (left) and Geo Map parameters (right).

Figure 5.38: Geospatial map (circle annotations) of USPTO patent influenza data.

To create a geospatial map with region coding, select usptoInfluenza.csv once again in the Data Manager and then run 'Visualization > Geospatial > Choropleth Map'. Use the following parameters:

Figure 5.39: Geospatial map (colored regions) of USPTO patent influenza data.

To see the log file from this workflow, save the 5.3.1 Geo USPTO (SDB Data) log file.

One can also create a US Geo Map with customized data by running the same workflow but selecting "US States" for the "Map" parameter (see below).

Figure 5.40: US map with area color coding and circle coding for aggregated data over states.

There are two available size scaling options, "Linear" and "Logarithmic". We recommend using logarithmic scaling for larger datasets.

Figure 5.41: US geospatial map of state-level data with logarithmic circle size scaling (left) and linear circle size scaling (right).
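The difference between the two scaling options can be illustrated outside of Sci2. The following is a minimal sketch, not the formula Sci2 itself uses: the function name `scale_sizes`, the size range of 1 to 10, and the sample counts are all assumptions chosen for illustration. It shows why logarithmic scaling tends to suit data with a wide value range: with linear scaling, small values collapse toward the minimum circle size.

```python
import math


def scale_sizes(values, min_size=1.0, max_size=10.0, mode="linear"):
    """Map positive data values to circle sizes in [min_size, max_size].

    Hypothetical helper for illustration; Sci2's internal formula may differ.
    """
    if mode == "logarithmic":
        transformed = [math.log10(v) for v in values]  # requires v > 0
    elif mode == "linear":
        transformed = [float(v) for v in values]
    else:
        raise ValueError("mode must be 'linear' or 'logarithmic'")

    lo, hi = min(transformed), max(transformed)
    if hi == lo:
        # All values are equal, so every circle gets the same size.
        return [max_size for _ in transformed]
    span = max_size - min_size
    return [min_size + (t - lo) / (hi - lo) * span for t in transformed]


# Made-up counts spanning three orders of magnitude.
counts = [1, 10, 100, 1000]
linear = scale_sizes(counts, mode="linear")
logarithmic = scale_sizes(counts, mode="logarithmic")

for count, lin, log in zip(counts, linear, logarithmic):
    print(f"{count:>5}  linear={lin:6.2f}  logarithmic={log:6.2f}")
```

In this sketch the three smallest counts are nearly indistinguishable under linear scaling, while logarithmic scaling spaces the circle sizes evenly across the orders of magnitude.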

5.3.2 Congressional District Geocoder

zipcode.csv



United States

Analysis Type(s):

Geospatial Analysis

To visualize Congressional Districts, you must first extract that data from a dataset containing either ZIP codes or addresses. You can download the Congressional District Geocoder plugin here. You can load any file that contains 9-digit U.S. ZIP codes to be geocoded. A sample file can be loaded by using 'File > Load' and following this path: 'yoursci2directory/sampledata/geo/zipcode.csv' (if the file is not in the sample data directory, it can be downloaded from 2.5 Sample Datasets). Load the file in "Standard csv format". Then select the file in the Data Manager and use 'Analysis > Geospatial > Congressional District Geocoder' with the following parameters:

Entries that fail to be geocoded (5-digit ZIP codes that span multiple congressional districts, empty entries, and invalid ZIP codes) are listed in warning messages on the console. The output table contains all columns of the input table with three additional columns appended: Congressional district, latitude, and longitude. To view the output table, save the file using 'File > Save...' and select the desired save location (to view the file in Excel, save it as a csv file). Once the file has been saved, it can be viewed with your choice of program. Below, the file has been opened as a csv file:
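The kinds of entries that trigger console warnings can be anticipated before the geocoder is run. The following is a minimal sketch of such a pre-check and is not part of Sci2: the function name `classify_zip`, the category labels, and the sample values are hypothetical, and the accepted ZIP formats (nine digits with or without a hyphen) are an assumption rather than documented plugin behavior.

```python
import re

# Assumed formats: 9-digit ZIP with an optional hyphen, or a plain 5-digit ZIP.
ZIP9_PATTERN = re.compile(r"\d{5}-?\d{4}")
ZIP5_PATTERN = re.compile(r"\d{5}")


def classify_zip(raw):
    """Return 'zip9', 'zip5', 'empty', or 'invalid' for one ZIP code entry."""
    value = (raw or "").strip()
    if not value:
        return "empty"
    if ZIP9_PATTERN.fullmatch(value):
        return "zip9"
    if ZIP5_PATTERN.fullmatch(value):
        # May still geocode, but can be ambiguous if it spans several districts.
        return "zip5"
    return "invalid"


# Illustrative entries only; they are not taken from zipcode.csv.
samples = ["47405-1901", "474051901", "47405", "", "4740A"]
report = {entry: classify_zip(entry) for entry in samples}

for entry, label in report.items():
    print(f"{entry!r:>14} -> {label}")
```

Entries labeled 'empty' or 'invalid' in this sketch correspond to rows that would be expected to appear in the warnings, while 'zip5' entries appear there only if the ZIP code covers more than one district.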


Before you can visualize this data, you will need to edit the csv file (shown above). It is easiest to edit the file with Excel. First, save the file from the Data Manager in Sci2 to a desired location. Open the file with Excel and add a column titled "Circle size". The value you choose does not matter, but it should be the same for every congressional district. (Note: the smaller the value, the more precise the visualization will be with regard to Congressional District.) Below, 0.5 was chosen for circle size:
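If Excel is not available, the same edit can be scripted. The following is a minimal sketch using Python's standard `csv` module: the helper name `add_constant_column` is hypothetical, and the in-memory sample table (including its column headers and values) is a made-up stand-in for the saved geocoder output, whose exact headers may differ.

```python
import csv
import io


def add_constant_column(rows, column="Circle size", value="0.5"):
    """Return copies of the rows with one extra column holding the same value."""
    return [{**row, column: value} for row in rows]


# Hypothetical stand-in for the saved geocoder output; real headers may differ.
saved_output = io.StringIO(
    "ZIP,Congressional District,Latitude,Longitude\n"
    "47405-1901,IN-09,39.17,-86.52\n"
)

rows = add_constant_column(list(csv.DictReader(saved_output)))

# Write the edited table back out as csv text.
edited = io.StringIO()
writer = csv.DictWriter(edited, fieldnames=list(rows[0].keys()), lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
result = edited.getvalue()
print(result)
```

To apply this to a real file, the `io.StringIO` objects would be replaced with files opened via `open(path, newline="")` for reading and `open(path, "w", newline="")` for writing.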

Once the csv file has been edited, reload it into Sci2. To visualize the newly loaded dataset, select the file in the data manager. Then select Visualization > Geospatial > Proportional Symbol Map and use the following parameters:

Note: to color the interior of the circles, you must select a value for the "Color Circle Interior By" tab. Here, "Circle size" was selected. The value itself is arbitrary, but its consistency results in consistent coloring in the final visualization: