数据库应用 --- Yelp Data Analysis Application
Overview
Basic Info
Functionality
初始GUI
Simple Business Search
选择main category 筛选出对应sub category, 选择sub category筛选出对应 attributes
点击execute business search显示符合条件的business
sub-category和attributes是optional的, 也就是不选择subCategory和attribute也可以查询business
在result中点击一条business,可以显示出给这个business点评过的所有用户
点击一个写过点评的用户可以显示出点评内容
Simple User Search
可以根据注册日期, Review Count的数量, 好友数量, 评价打分,投票数筛选用户
点击Execute User Search显示查找结果,点击一个用户显示此用户的所有点评,点击一条点评显示具体内容
筛选review
在review栏中添加筛选条件,可以筛选review (这个用户之前有6个review,现在只有一个了
同样可以筛选business 的review
这个business之前有很多review现在只有三条
OR AND
- business search 和 user search有AND和OR属性
- business search: 表示main category, sub category, attributes之间是and关系或者or关系
- user search: 表示user search的各个条件是and或者or关系
Implementation
JSON files — source data
business.json
user.json
review.json
create.db — create tables
CREATE TABLE BUSINESS (
business_id VARCHAR(25) PRIMARY KEY,
full_address VARCHAR(200) NOT NULL,
ifOpen VARCHAR(5),
city VARCHAR(50) NOT NULL,
state_name VARCHAR(20) NOT NULL,
review_count NUMBER,
business_name VARCHAR(100) NOT NULL,
starts NUMBER
);
CREATE TABLE BUSINESS_MAIN_CATEGORIES(
business_id VARCHAR(25),
main_category VARCHAR(50),
PRIMARY KEY (business_id, main_category),
FOREIGN KEY(business_id) REFERENCES BUSINESS(business_id)
);
CREATE TABLE BUSINESS_SUB_CATEGORIES (
business_id VARCHAR(25),
sub_category VARCHAR(50),
PRIMARY KEY (business_id, sub_category),
FOREIGN KEY(business_id) REFERENCES BUSINESS(business_id)
);
CREATE TABLE BUSINESS_ATTRIBUTES (
business_id VARCHAR(25),
attributes VARCHAR(50),
PRIMARY KEY (business_id, attributes),
FOREIGN KEY(business_id) REFERENCES BUSINESS(business_id)
);
CREATE TABLE YELP_USERS (
yelping_since DATE,
num_votes NUMBER,
review_count NUMBER,
user_name VARCHAR(100),
user_id VARCHAR(25) PRIMARY KEY,
average_stars NUMBER,
numFriend NUMBER
);
CREATE TABLE REVIEW (
num_votes NUMBER,
user_id VARCHAR(25) NOT NULL,
review_id VARCHAR(25) PRIMARY KEY,
stars NUMBER,
review_date DATE,
review_text Long,
business_id VARCHAR(25) NOT NULL,
FOREIGN KEY(business_id) REFERENCES BUSINESS(business_id),
FOREIGN KEY(user_id) REFERENCES YELP_USERS(user_id)
);
Populate.Java — populate data
JDBC connection
从JSON文件提取数据
使用simpleJson — import org.json.simple.
Example: review.json
//transform every line of input to json object
JSONParser parser = new JSONParser();
JSONObject jsonObject = (JSONObject) parser.parse(line);
//retrieve json values from the first layer of the json object
String text = (String) jsonObject.get("text");
String user_id = (String) jsonObject.get("user_id");
String review_id = (String) jsonObject.get("review_id");
//get stars
long stars = (long)jsonObject.get("stars");
//get date
String date_string = (String) jsonObject.get("date");
SimpleDateFormat sdf1 = new SimpleDateFormat("yyyy-MM-dd");
java.util.Date javaDate = sdf1.parse(date_string);
java.sql.Date sqlDate = new java.sql.Date(javaDate.getTime());
//get business id
String business_id = (String) jsonObject.get("business_id");
// get votes
JSONObject votes = (JSONObject) jsonObject.get("votes");
long funny = (long) votes.get("funny");
long useful = (long) votes.get("useful");
long cool = (long) votes.get("cool");
long numVotes = funny + useful + cool;
hw3.java — GUI + SQL query
构建GUI
声明swing组件
Example:
///main panel/
private JPanel mainPanel = new JPanel();
///under main panel///
//business panel
private JPanel businessPanel = new JPanel();
private JLabel businessLabel = new JLabel();
//review panel
private JPanel reviewPanel = new JPanel();
private JLabel reviewLabel = new JLabel();
/user panel
private JPanel userPanel = new JPanel();
private JLabel userLabel = new JLabel();
/result panel
private JPanel resultPanel = new JPanel();
private JLabel resultLabel = new JLabel();
初始化GUI
//set the frame
setDefaultCloseOperation(WindowConstants.EXIT_ON_CLOSE);
setResizable(false);
setDefaultCloseOperation(WindowConstants.EXIT_ON_CLOSE);
setResizable(false);
///main panel
mainPanel.setBackground(new Color(0, 255, 255
mainPanel.setPreferredSize(new Dimension(1300, 900));
mainPanel.setLayout(new GridLayout(2, 2)); //divide the window with four equal parts
businessPanel();
reviewPanel();
userPanel();
resultPanel();
Business Query
使用HashSet储存选择的main category, sub category,attributes和查询出来的main category, sub category,attributes
private HashSet<String> queriedMainCat = new HashSet<>();
private HashSet<String> selectedMainCat = new HashSet<>();
private HashSet<String> queriedSubCat = new HashSet<>();
private HashSet<String> selectedSubCat = new HashSet<>();
private HashSet<String> queriedAttributes = new HashSet<>();
private HashSet<String> selectedAttributes = new HashSet<>();
display Main Category
打开app后显示固定的Main Category, 需要的SQL查询如下:
SELECT DISTINCT main_category, business_id
FROM BUSINESS_MAIN_CATEGORIES
private void initMainCat() {
selectedMainCat.clear();
StringBuilder sqlQuery = new StringBuilder();
sqlQuery.append("SELECT DISTINCT main_category, business_id").append("\n")
.append("FROM BUSINESS_MAIN_CATEGORIES\n");
System.out.println("---------Prepare SQL query for main categories--------");
System.out.println(sqlQuery);
try(Connection connection = getConnection()) {
//get all the main categories
PreparedStatement queryMainCategoryStm = connection.prepareStatement(sqlQuery.toString());
ResultSet results = queryMainCategoryStm.executeQuery();
System.out.println("---------SQL query send--------");
//add main categories to queriedMainCat
while (results.next()) {
queriedMainCat.add(results.getString("main_category"));
}
} catch (SQLException throwables) {
throwables.printStackTrace();
} catch (ClassNotFoundException e) {
e.printStackTrace();
}
//display all the main categories and keep records of selected main categories
updateCategoryScreen(mainCategoryListPanel, queriedMainCat, selectedMainCat);
System.out.println("---------Display main categories--------\n");
//listener for clicking the select button
mainSelectButton.addActionListener(new ActionListener() {
@Override
public void actionPerformed(ActionEvent e) {
searchWithMainCat("select");
}
});
}
search with main category
上图点击select会发出
SELECT DISTINCT sub_category
FROM BUSINESS_SUB_CATEGORIES b
WHERE b.business_id in (
SELECT DISTINCT business_id
FROM BUSINESS_MAIN_CATEGORIES t
WHERE t.main_category in ('Convenience Stores', 'Dentists')
GROUP BY business_id
HAVING COUNT(business_id) = 2
)
当查询关系为AND的时候
GROUP BY business_id
将相同的business_id分组
HAVING COUNT(business_id) = 2
计算每个分组中相同的business_id的数量,挑选count = 2的business_id, 因为选择了两个main category
这样只选择在两个category都出现过的business_id
实现 AND 的关系
上图点击execute business search会发出
SELECT b.business_name, b.business_id, b.state_name, b.city, b.starts
FROM business b
WHERE b.business_id in (
SELECT DISTINCT business_id
FROM BUSINESS_MAIN_CATEGORIES t
WHERE t.main_category in ('Convenience Stores', 'Dentists')
GROUP BY business_id
HAVING COUNT(business_id) = 2
)
Implementation
private void searchWithMainCat(String clickType) {
if (businessSearchComboBox.getSelectedItem() == "Choose AND or OR for the search") {
JOptionPane.showMessageDialog(null, "Please select an operation: AND, OR");
return;
}
queriedSubCat.clear();
subCategoryListPanel.removeAll();
StringBuilder sqlQuery = new StringBuilder();
searchWithMainCategorySQL(sqlQuery, clickType);
try(Connection connection = getConnection()) {
Statement queryUserStm = connection.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,
ResultSet.CONCUR_READ_ONLY);
ResultSet results = queryUserStm.executeQuery(sqlQuery.toString());
if (clickType == "execute") {
String[] columns = new String[]{"Business_name", "Business_id", "State_name", "City", "Stars"};
displayResults(results, columns, resultTableForBusiness, extraPanelForResult, "business");
}
else {
while (results.next()) {
String subCategory = results.getString("sub_category");
queriedSubCat.add(subCategory);
}
}
} catch (SQLException throwables) {
throwables.printStackTrace();
} catch (ClassNotFoundException e) {
e.printStackTrace();
}
//display all the sub categories and keep records of selected sub categories
updateCategoryScreen(subCategoryListPanel, queriedSubCat, selectedSubCat);
System.out.println("---------Display sub categories--------");
//listener for subSelectButton
subSelectButton.addActionListener(new ActionListener() {
@Override
public void actionPerformed(ActionEvent e) {
searchWithSubCat("select");
}
});
}
private void searchWithMainCategorySQL(StringBuilder sqlQuery, String clickType) {
if (clickType == "execute") {
sqlQuery.append("SELECT b.business_name, b.business_id, b.state_name, b.city, b.starts\n")
.append("FROM business b").append("\n");
}
else if (clickType == "select") {
selectedSubCat.clear();
selectedAttributes.clear();
System.out.println("selected sub category cleared");
System.out.println("selected attributes cleared\n");
sqlQuery.append("SELECT DISTINCT sub_category\n")
.append("FROM BUSINESS_SUB_CATEGORIES b").append("\n");
}
sqlQuery.append("WHERE b.business_id in (\n");
sqlTemplate(sqlQuery, "mainCategory", selectedMainCat);
sqlQuery.append("\n)");
System.out.println(sqlQuery);
System.out.println("---------SQL query send--------" + "\n");
}
search with sub category
上图点击select会发出
SELECT DISTINCT attributes
FROM BUSINESS_ATTRIBUTES b
WHERE b.business_id in (
SELECT DISTINCT business_id
FROM BUSINESS_MAIN_CATEGORIES t
WHERE t.main_category in ('Convenience Stores', 'Dentists')
INTERSECT
SELECT DISTINCT business_id
FROM BUSINESS_SUB_CATEGORIES t
WHERE t.sub_category in ('Day Spas', 'Cosmetic Dentists', 'Endodontists')
)
上图点击execute business search会发出
SELECT b.business_name, b.business_id, b.state_name, b.city, b.starts
FROM business b
WHERE b.business_id in (
SELECT DISTINCT business_id
FROM BUSINESS_MAIN_CATEGORIES t
WHERE t.main_category in ('Convenience Stores', 'Dentists')
INTERSECT
SELECT DISTINCT business_id
FROM BUSINESS_SUB_CATEGORIES t
WHERE t.sub_category in ('Day Spas', 'Cosmetic Dentists', 'Endodontists')
)
Implementation
private void searchWithSubCat(String clickType) {
queriedAttributes.clear();
attributeListPanel.removeAll();
StringBuilder sqlQuery = new StringBuilder();
searchWithSubCategorySQL(sqlQuery, clickType);
System.out.println(sqlQuery.toString());
try(Connection connection = getConnection()) {
Statement queryUserStm = connection.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,
ResultSet.CONCUR_READ_ONLY);
ResultSet results = queryUserStm.executeQuery(sqlQuery.toString());
System.out.println("---------SQL query send--------");
if (clickType == "execute") {
String[] columns = new String[]{"Business_name", "Business_id", "State_name", "City", "Stars"};
displayResults(results, columns, resultTableForBusiness, extraPanelForResult, "business");
}
else {
while (results.next()) {
String attributes = results.getString("attributes");
queriedAttributes.add(attributes);
}
}
} catch (SQLException throwables) {
throwables.printStackTrace();
} catch (ClassNotFoundException e) {
e.printStackTrace();
}
//display all the attributes and keep records of selected attributes
updateCategoryScreen(attributeListPanel, queriedAttributes, selectedAttributes);
}
private void searchWithSubCategorySQL (StringBuilder sqlQuery, String clickType) {
if (clickType == "execute") {
sqlQuery.append("SELECT b.business_name, b.business_id, b.state_name, b.city, b.starts\n")
.append("FROM business b").append("\n");
}
else if (clickType == "select") {
selectedAttributes.clear();
System.out.println("selected attributes cleared\n");
sqlQuery.append("SELECT DISTINCT attributes\n")
.append("FROM BUSINESS_ATTRIBUTES b").append("\n");
}
sqlQuery.append("WHERE b.business_id in (\n");
sqlTemplate(sqlQuery, "mainCategory", selectedMainCat);
sqlQuery.append("\n INTERSECT \n");
sqlTemplate(sqlQuery, "subCategory", selectedSubCat);
sqlQuery.append("\n)");
System.out.println(sqlQuery);
System.out.println("---------SQL query send--------" + "\n");
}
search with attributes
上图点击execute business search会发出
SELECT b.business_name, b.business_id, b.state_name, b.city, b.starts
FROM business b
WHERE b.business_id in (
SELECT DISTINCT business_id
FROM BUSINESS_MAIN_CATEGORIES t
WHERE t.main_category in ('Convenience Stores', 'Dentists')
GROUP BY business_id
HAVING COUNT(business_id) = 2
INTERSECT
SELECT DISTINCT business_id
FROM BUSINESS_SUB_CATEGORIES t
WHERE t.sub_category in ('Day Spas', 'Cosmetic Dentists', 'Endodontists')
GROUP BY business_id
HAVING COUNT(business_id) = 3
INTERSECT
SELECT DISTINCT business_id
FROM BUSINESS_ATTRIBUTES t
WHERE t.attributes in ('Parking_lot_false', 'Parking_validated_false', 'Parking_street_false')
GROUP BY business_id
HAVING COUNT(business_id) = 3
)
Implementation
private void searchWithAttribute(String clickType) {
StringBuilder sqlQuery = new StringBuilder();
searchWithAttributesSQL(sqlQuery, clickType);
try (Connection connection = getConnection()) {
Statement queryUserStm = connection.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,
ResultSet.CONCUR_READ_ONLY);
ResultSet results = queryUserStm.executeQuery(sqlQuery.toString());
String[] columns = new String[]{"Business_name", "Business_id", "State_name", "City", "Stars"};
displayResults(results, columns, resultTableForBusiness, extraPanelForResult, "business");
} catch (SQLException throwables) {
throwables.printStackTrace();
} catch (ClassNotFoundException e) {
e.printStackTrace();
}
}
private void searchWithAttributesSQL(StringBuilder sqlQuery, String clickType) {
if (clickType == "execute") {
sqlQuery.append("SELECT b.business_name, b.business_id, b.state_name, b.city, b.starts\n")
.append("FROM business b").append("\n");
}
sqlQuery.append("WHERE b.business_id in (\n");
sqlTemplate(sqlQuery, "mainCategory", selectedMainCat);
sqlQuery.append("\n INTERSECT \n");
sqlTemplate(sqlQuery, "subCategory", selectedSubCat);
sqlQuery.append("\n INTERSECT \n");
sqlTemplate(sqlQuery, "attributes", selectedAttributes);
sqlQuery.append("\n)");
System.out.println(sqlQuery);
System.out.println("---------SQL query send--------" + "\n");
}
common part for building all SQL queries
private void sqlTemplate(StringBuilder sqlQuery, String filterType, HashSet<String> selectedItems) {
String tableName = "";
String elementName = "";
if (filterType == "mainCategory") {
tableName = "BUSINESS_MAIN_CATEGORIES t";
elementName = "main_category";
}
else if (filterType == "subCategory") {
tableName = "BUSINESS_SUB_CATEGORIES t";
elementName = "sub_category";
}
else if (filterType == "attributes") {
tableName = "BUSINESS_ATTRIBUTES t";
elementName = "attributes";
}
Iterator it = selectedItems.iterator();
StringBuilder selectedItemString = new StringBuilder();
while (it.hasNext()) {
String item = it.next().toString();
item = item.replace("'", "''");
if (it.hasNext()) {
selectedItemString.append("'").append(item).append("'").append(",").append(" ");
}
else {
selectedItemString.append("'").append(item).append("'");
}
}
int count = selectedItems.size();
sqlQuery.append("SELECT DISTINCT business_id \n");
sqlQuery.append("FROM ").append(tableName).append("\n");
sqlQuery.append("WHERE ").append("t.").append(elementName).append(" in ")
.append("(");
sqlQuery.append(selectedItemString.toString());
sqlQuery.append(")\n");
if (businessSearchComboBox.getSelectedItem() == "AND") {
sqlQuery.append("GROUP BY business_id\n");
sqlQuery.append("HAVING COUNT(business_id) = ").append(String.valueOf(count));
}
}
User Query
点击Execute User Search会发出
SELECT *
FROM YELP_USERS
WHERE review_count >10
OR numFriend <100
OR average_stars >2
OR num_votes >10
Implementation
private void userSearch() {
String userLogicOperator = userSearchComboBox.getSelectedItem().toString();
String memberSince = memberSinceTextField.getText();
String reviewCntOperator = reviewCntComboBox.getSelectedItem().toString();
String reviewCntValue = reviewCntTextField.getText();
String numFriendOpertor = numFriendsComboBox.getSelectedItem().toString();
String numFriendsValue = numFriendsTextField.getText();
String avgStarsOperator = avgStarsComboBox.getSelectedItem().toString();
String avgStarValue = avgStarsTextField.getText();
String numVotesOperator = numVotesComboBox.getSelectedItem().toString();
String numVotesValue = numVotesTextField.getText();
StringBuilder sqlQuery = new StringBuilder();
sqlQuery.append("SELECT *").append("\n")
.append("FROM YELP_USERS").append("\n");
boolean firstCondition = true;
if (isValidDate(memberSince)) {
ifFirstCondition(firstCondition, sqlQuery, userLogicOperator);
sqlQuery.append("yelping_since").append(" >= ").append("TO_DATE('").append(memberSince).append("','")
.append("YYYY-MM-DD')")
.append("\n");
firstCondition = false;
}
if (reviewCntOperator != "=, >, <" && isValidNumber(reviewCntValue)) {
ifFirstCondition(firstCondition, sqlQuery, userLogicOperator);
sqlQuery.append(" review_count ")
.append(reviewCntOperator).append(reviewCntValue).append("\n");
firstCondition = false;
}
if (numFriendOpertor != "=, >, <" && isValidNumber(numFriendsValue)) {
ifFirstCondition(firstCondition, sqlQuery, userLogicOperator);
sqlQuery.append(" numFriend ")
.append(numFriendOpertor).append(numFriendsValue).append("\n");
firstCondition = false;
}
if (avgStarsOperator != "=, >, <" && isValidNumber(avgStarValue)) {
ifFirstCondition(firstCondition, sqlQuery, userLogicOperator);
sqlQuery.append(" average_stars ")
.append(avgStarsOperator).append(avgStarValue).append("\n");
firstCondition = false;
}
if (numVotesOperator != "=, >, <" && isValidNumber(numVotesValue )) {
ifFirstCondition(firstCondition, sqlQuery, userLogicOperator);
sqlQuery.append(" num_votes ")
.append(numVotesOperator).append(numVotesValue);
firstCondition = false;
}
System.out.println(sqlQuery.toString());
//display the results
try (Connection connection = getConnection()){
Statement queryUserStm = connection.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,
ResultSet.CONCUR_READ_ONLY);
ResultSet results = queryUserStm.executeQuery(sqlQuery.toString());
String [] columns = new String [] {"Joined Date", "Num Votes", "Review Cnt",
"User Name", "User Id", "Avg Stars", "Num Friends"};
displayResults(results, columns, resultTableForUser, extraPanelForResult, "user");
} catch (SQLException | ClassNotFoundException e) {
e.printStackTrace();
}
}
Review Query
设置好review筛选条件后,点击查询结果中任意的一条business或者user会发出SQL查询
如下图会发出
SELECT *
FROM REVIEW
WHERE user_id = 'uygg55wWaEP0xTR7dHDg-g'
AND ( stars >2
OR num_votes <100
)
Implementation
private StringBuilder getReviewList(String type, String id, String [] columns, StringBuilder sqlQuery) {
String idType = "";
String logicOperator = "";
String dateFrom = dateFromTextField.getText();
String dateTo = dateToTextField.getText();
String reviewStars = starsTextField.getText();
String reviewStarsOperator = starsComboBox.getSelectedItem().toString();
String reviewVotes = votesTextField.getText();
String reviewVotesOperator = votesComboBox.getSelectedItem().toString();
if (type == "business") {
logicOperator = businessSearchComboBox.getSelectedItem().toString();
idType = "business_id";
}
else if (type == "user") {
logicOperator = userSearchComboBox.getSelectedItem().toString();
idType = "user_id";
}
sqlQuery.append("SELECT *").append("\n")
.append("FROM REVIEW").append("\n")
.append("WHERE ").append(idType).append(" = '").append(id).append("'\n");
boolean firstCondition = true;
if (isValidDate(dateFrom)) {
if (firstCondition) {
sqlQuery.append("AND (");
}
else {
sqlQuery.append(logicOperator);
}
sqlQuery.append(" review_date").append(" >= ")
.append(" TO_DATE('").append(dateFrom).append("','").append("YYYY-MM-DD')")
.append("\n");
firstCondition = false;
}
if (isValidDate(dateTo)) {
if (firstCondition) {
sqlQuery.append("AND (");
}
else {
sqlQuery.append(logicOperator);
}
sqlQuery.append(" review_date").append(" <= ")
.append(" TO_DATE('").append(dateTo).append("','").append("YYYY-MM-DD')")
.append("\n");
firstCondition = false;
}
if (reviewStarsOperator != "=, >, <" && isValidNumber(reviewStars)) {
if (firstCondition) {
sqlQuery.append("AND (");
}
else {
sqlQuery.append(logicOperator);
}
sqlQuery.append(" stars ")
.append(reviewStarsOperator).append(reviewStars).append("\n");
firstCondition = false;
}
if (reviewVotesOperator != "=, >, <" && isValidNumber(reviewVotes)) {
if (firstCondition) {
sqlQuery.append("AND (");
}
else {
sqlQuery.append(logicOperator);
}
sqlQuery.append(" num_votes ")
.append(reviewVotesOperator).append(reviewVotes).append("\n");
firstCondition = false;
}
if (!firstCondition) {
sqlQuery.append(")");
}
System.out.println(sqlQuery.toString());
return sqlQuery;
}
Notes
- 如何写SQL 查询语句会极大的影响查询速度和代码复杂度