Resetting the Index, Setting New Column Names, and Renaming Elements Within a Column in Pandas
Got it, let's tackle your problem step by step. First, I'll fill in the gaps in your original code (since your column definition was truncated) and walk through each of your required operations with clear, actionable code.
1. First, Define Your DataFrame Properly
Let's start by fixing and setting up your initial DataFrame correctly:
    import pandas as pd

    df = pd.DataFrame(
        [
            ['Directions to Starbucks', 1045],
            ['Show me directions to Starbucks', 754],
            ['Give me directions to Starbucks', 612],
            ['Navigate me to Starbucks', 498],
            ['Display navigation to Starbucks', 376],
            ['Direct me to Starbucks', 201],
            ['Navigate to Starbucks', 180]
        ],
        columns=['Utterance', 'Count']  # Completed the column names here
    )
2. Reset the Index
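A freshly built DataFrame already has a default 0-based index, so this step only matters after filtering, sorting, or otherwise ending up with a non-default index. To get a clean 0-based index and discard the existing one, call reset_index() with drop=True. To keep the old index as a new column, omit drop=True; on pandas 1.5 or later you can also name that column with names=: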
    # Option 1: Reset index and drop the old index entirely
    df = df.reset_index(drop=True)

    # Option 2: Reset index and save old index as a new column called "OriginalIndex"
    # df = df.reset_index(names="OriginalIndex")
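To see what the two options actually do, here is a minimal sketch on a small made-up frame (`demo` and its values are purely illustrative, not your data) whose index has gaps after filtering:

```python
import pandas as pd

# Hypothetical data, for illustration only
demo = pd.DataFrame({'Utterance': ['a', 'b', 'c', 'd'], 'Count': [4, 3, 2, 1]})

# Filtering keeps the original labels, so the index now has gaps (0 and 2)
filtered = demo[demo['Count'] % 2 == 0]

# drop=True discards the old labels and renumbers from 0
dropped = filtered.reset_index(drop=True)

# Without drop=True the old labels are kept in a column named "index"
kept = filtered.reset_index()

print(list(filtered.index))   # [0, 2]
print(list(dropped.index))    # [0, 1]
print(list(kept.columns))     # ['index', 'Utterance', 'Count']
```

If the old labels carry no meaning, drop=True is usually the tidier choice, since it avoids an extra column you would have to remove later.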
3. Rename Your Columns
Use the rename() method to assign new, more descriptive column names. Pass a dictionary mapping old column names to new ones:
    df = df.rename(columns={
        'Utterance': 'UserQuery',
        'Count': 'QueryFrequency'
    })
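One thing to watch for: by default rename() silently ignores keys that match no column, so a typo in an old name leaves the frame unchanged. The sketch below uses a made-up one-row frame and a deliberately misspelled key to show that behavior, and how errors='raise' surfaces the mistake:

```python
import pandas as pd

# Hypothetical one-row frame, for illustration only
demo = pd.DataFrame({'Utterance': ['a'], 'Count': [1]})

# 'Utterence' is a deliberate typo: with the default errors='ignore',
# nothing is renamed and no error is raised
unchanged = demo.rename(columns={'Utterence': 'UserQuery'})

# errors='raise' turns the same typo into a KeyError
try:
    demo.rename(columns={'Utterence': 'UserQuery'}, errors='raise')
    typo_detected = False
except KeyError:
    typo_detected = True

print(list(unchanged.columns))  # ['Utterance', 'Count']
print(typo_detected)            # True
```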
4. Rename String Elements in the Column
To standardize the text in your query column, create a mapping dictionary that groups similar utterances into a single label, then apply it with replace(). For example, let's group all Starbucks navigation queries into one standardized phrase:
    # Define your string replacement rules
    utterance_mapping = {
        'Directions to Starbucks': 'Navigate to Starbucks',
        'Show me directions to Starbucks': 'Navigate to Starbucks',
        'Give me directions to Starbucks': 'Navigate to Starbucks',
        'Navigate me to Starbucks': 'Navigate to Starbucks',
        'Display navigation to Starbucks': 'Navigate to Starbucks',
        'Direct me to Starbucks': 'Navigate to Starbucks'
    }

    # Apply the mapping to the UserQuery column
    df['UserQuery'] = df['UserQuery'].replace(utterance_mapping)
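It may help to see why replace() fits here better than map(). In this small sketch (the Series `queries` and the lowercase variant are invented for illustration), replace() leaves values that are not in the dictionary untouched, while map() turns them into NaN. Both match whole values exactly, including case:

```python
import pandas as pd

# Hypothetical values; the lowercase entry is there to show exact matching
queries = pd.Series([
    'Directions to Starbucks',
    'Navigate to Starbucks',
    'directions to starbucks',
])
mapping = {'Directions to Starbucks': 'Navigate to Starbucks'}

# replace(): values without a matching key pass through unchanged
replaced = queries.replace(mapping)

# map(): values without a matching key become NaN
mapped = queries.map(mapping)

print(replaced.tolist())
print(mapped.isna().tolist())  # [False, True, True]
```

So if your mapping only covers some utterances, replace() keeps the rest as they are, which is what you want when 'Navigate to Starbucks' is already in its standardized form.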
Full Combined Code (With Optional Aggregation)
If you want to sum the counts for identical standardized queries, add a groupby() step at the end:
    import pandas as pd

    # Original DataFrame
    df = pd.DataFrame(
        [
            ['Directions to Starbucks', 1045],
            ['Show me directions to Starbucks', 754],
            ['Give me directions to Starbucks', 612],
            ['Navigate me to Starbucks', 498],
            ['Display navigation to Starbucks', 376],
            ['Direct me to Starbucks', 201],
            ['Navigate to Starbucks', 180]
        ],
        columns=['Utterance', 'Count']
    )

    # Step 1: Reset index
    df = df.reset_index(drop=True)

    # Step 2: Rename columns
    df = df.rename(columns={'Utterance': 'UserQuery', 'Count': 'QueryFrequency'})

    # Step 3: Standardize string values
    utterance_mapping = {
        'Directions to Starbucks': 'Navigate to Starbucks',
        'Show me directions to Starbucks': 'Navigate to Starbucks',
        'Give me directions to Starbucks': 'Navigate to Starbucks',
        'Navigate me to Starbucks': 'Navigate to Starbucks',
        'Display navigation to Starbucks': 'Navigate to Starbucks',
        'Direct me to Starbucks': 'Navigate to Starbucks'
    }
    df['UserQuery'] = df['UserQuery'].replace(utterance_mapping)

    # Optional: Aggregate counts for identical queries
    df_aggregated = df.groupby('UserQuery').sum().reset_index()
    print(df_aggregated)
Sample Output
After running the aggregated code, you'll get this clean result:
                   UserQuery  QueryFrequency
    0  Navigate to Starbucks            3666
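If your real data contains more than one standardized query, a slightly more explicit variant of the aggregation may be handy. This is only a sketch: the frame `demo`, its numbers, and the 'Find Starbucks' label are invented for illustration. It uses as_index=False instead of a trailing reset_index(), names the column being summed, and sorts the totals in descending order:

```python
import pandas as pd

# Hypothetical, already-standardized data with two distinct queries
demo = pd.DataFrame({
    'UserQuery': ['Navigate to Starbucks', 'Navigate to Starbucks', 'Find Starbucks'],
    'QueryFrequency': [1045, 180, 50],
})

totals = (
    demo.groupby('UserQuery', as_index=False)['QueryFrequency']
    .sum()                                   # one row per distinct query
    .sort_values('QueryFrequency', ascending=False, ignore_index=True)
)

print(totals)
```

Selecting 'QueryFrequency' explicitly keeps the sum from touching any other columns you might add later.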
Feel free to adjust the mapping, column names, or index behavior to fit your exact use case!
The question originally comes from Stack Exchange; asked by user_seaweed.




