Chapter 14 Data
14.1 SAPA project website
14.1.1 Introduction
The SAPA Project hosts surveys that explore different dimensions of personality.
14.1.1.1 Infrastructure
The SAPA Project is hosted on Google Cloud. There is one active compute engine virtual machine (VM) instance, and one backup or staging instance. The VM instances run an Apache/2.4.25 web server on Debian 9 (stretch) Linux based operating system that hosts the actual SAPA Project website. Each time a user completes a survey, the survey results are stored in a MySQL database, which is running on a Google Cloud SQL instance running MySQL 5.7.
14.1.1.1.1 Admin console
Admin console for the computing infrastructure is available at https://console.cloud.google.com.
14.1.1.1.2 How to add new administrators / owners
Add a new user from the IAM & Admin admin console page.
14.1.1.2 How to connect to database
First time users:
- Download and install the Google Cloud SDK
- Download and install the Google Cloud SQL Proxy
- Download and install a MySQL client. On macOS, use homebrew to install a command line client via
brew install mysql
, or you can install the MySQL Workbench GUI client
- In a terminal session, authenticate to Google services
gcloud auth login
A web browser will open and direct you to login to your Google account.
- Start the
cloud_sql_proxy
:
cloud_sql_proxy -instances=silken-alloy-248920:us-west1:woodworth-pi-improvement-project-1920=tcp:3306
If the command succeeds, you’ll see the following message returned to your shell:
Ready for new connections
The Cloud SQL Proxy allows you to make connections to the MySQL database running in Google Cloud as if the database was running locally.
- In a new terminal window, verify the connection:
mysql -h 127.0.0.1 -u SAPAreader -p
Enter the password when prompted. If the connection is successful, then you’ll see the mysql prompt:
mysql>
14.1.1.3 Connecting from R
Following the instructions here: https://cloud.google.com/blog/products/gcp/google-cloud-platform-for-data-scientists-using-r-with-google-cloud-sql-for-mysql
It is helpful to create a function that creates the connection to the MySQL database:
library(RMySQL)
<- function(){
getSqlConnection <-
con dbConnect(
::MySQL(),
RMySQLusername = 'SAPAreader',
password = 'password',
host = '127.0.0.1',
dbname = 'SAPAactive'
)return(con)
}
Then you can query any table in the SAPAactive
database by following this example, which queries 10 results from the TAIE_responses_111119
table:
<- getSqlConnection()
conn <- dbSendQuery(conn, "select * from TAIE_responses_111119 limit 10")
res <- dbFetch(res)
data print(data)
dbDisconnect(conn)
More forthcoming