Commits · fix-sms-tlbresp-20230619 · OpenXiangShan / XiangShan

Jun 19, 2023 · 1 commit
- Memblock: Fix SMS prefetch path · cdbfd884
  Committed by good-circle on Jun 19, 2023
Jun 15, 2023 · 1 commit
- LQ: fix replay logic for 3ld2st (#2136) · 44cbc983
  Committed by sfencevma on Jun 15, 2023
```
Co-authored-by: Lyn <lyn@Lyns-MacBook-Pro.local>
```
Jun 13, 2023 · 2 commits
- FreeList: fix freelist for 3ld2st (#2133) · caaadfbe
  Committed by sfencevma on Jun 13, 2023
```
Co-authored-by: Lyn <lyn@Lyns-MacBook-Pro.local>
```
- LQ: Optimizing LoadQueueReplay replay timing (#2127) · 8a610956
  Committed by sfencevma on Jun 13, 2023
```
* Replay latency increased from 2 to 3 cycles
* Simplified replay selection logic
```
Jun 12, 2023 · 7 commits

DCache: fix ecc response timing (#2130) · fe46839f

Committed by Maxpicca on Jun 12, 2023

* dcache: fix the timing coupling of `ecc_resp` and `s1_tag_match`

* dcache: fix bug in cacheOp's ecc

* dcache: fix compilation error

DCache: fix ecc response timing (#2130) · 4e223ee4

Committed by Maxpicca on Jun 12, 2023

* dcache: fix the timing coupling of `ecc_resp` and `s1_tag_match`

* dcache: fix bug in cacheOp's ecc

* dcache: fix compilation error

LQ, freelist: remove enqOffset for 3ld2st (#2121) · bd65812f
Committed by sfencevma on Jun 12, 2023

MissQueue: Optimizing enqueue timing (#2119) · 6b5c3d02

Committed by happy-lx on Jun 12, 2023

* dcache: split missqueue enq logic

Enqueuing a miss request into the missqueue is now split across two
cycles: the first cycle determines whether the request can enq or merge,
and the second cycle performs the actual data update.

To send the acquire request to L2 as early as possible, the pipeline
register also sends acquire when conditions allow. If that send
succeeds, s_acquire does not need to be set to false when the MSHR is
updated.

* missqueue: adjust priority

Make acquire from pipereg have highest priority

* dcache: add some pf counter

* missqueue: fix acquire source in pipeline reg

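
The two-cycle enqueue described in this commit message can be sketched as a small behavioural model. This is an illustration only, not the Chisel code from the PR: `MissQueueModel`, `Mshr`, `step`, and `l2_ready` are hypothetical names, and the model assumes that the second-cycle update is visible to a request making its first-cycle decision in the same step.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Mshr:
    block_addr: int
    s_acquire: bool      # True means the acquire has already been sent
    merged: int = 0      # number of later requests merged into this entry


@dataclass
class MissQueueModel:
    """Hypothetical behavioural model of a two-cycle missqueue enqueue."""
    num_entries: int
    entries: List[Mshr] = field(default_factory=list)
    pipe_reg: Optional[Tuple[str, int]] = None  # (decision, block_addr)

    def step(self, req_addr: Optional[int] = None, l2_ready: bool = False):
        """Advance one cycle and return the first-cycle decision for req_addr."""
        # Second cycle of the previous request: perform the actual update.
        if self.pipe_reg is not None:
            decision, addr = self.pipe_reg
            self.pipe_reg = None
            if decision == "alloc":
                # The pipeline register tries to send acquire itself. If L2
                # accepts it, the new MSHR starts with s_acquire already done.
                self.entries.append(Mshr(addr, s_acquire=l2_ready))
            else:
                target = next(e for e in self.entries if e.block_addr == addr)
                target.merged += 1

        # First cycle of the new request: decide only, write no MSHR state.
        if req_addr is None:
            return None
        if any(e.block_addr == req_addr for e in self.entries):
            decision = "merge"
        elif len(self.entries) < self.num_entries:
            decision = "alloc"
        else:
            return "reject"
        self.pipe_reg = (decision, req_addr)
        return decision
```

Calling `step(0x40)` returns `"alloc"` while leaving `entries` empty. The entry appears only on the following `step`, with `s_acquire` set according to whether L2 was ready in that cycle.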

SMS: Regnext tlb req from arbiter for better timing (#2122) · 375a3f86
Committed by Haoyuan Feng on Jun 12, 2023

LQ: fix rar release check, remove delay cycle (#2120) · 4ab5d137
Committed by sfencevma on Jun 12, 2023
```
* In the latest design, the delayed release check no longer happens.
```

SQ: RegNext cancelcount for better timing (#2126) · 50cb93ff
Committed by xinyao zheng on Jun 12, 2023
```
* The path from CancelCount to EnqPtr violates the timing requirement
* Add one cycle with RegNext for better timing.
```

Jun 09, 2023 · 1 commit

dcache: cache line level sram bank and fine-grained rw bank conflict check (#2099) · 3eeae490

Committed by Maxpicca on Jun 09, 2023

* Divide the dcache SRAM into N parts on top of the 8 banks in a cache line.
    * N is configurable; N = 1 gives the original configuration.
* Check read-write bank conflicts at a finer granularity, based on this division.
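
As a rough illustration of how a division index can make the read-write bank conflict check finer-grained, here is a minimal sketch. The 64-byte line size, the use of the low line-index bits to pick the division, and the names `locate` and `rw_bank_conflict` are all assumptions made for the example; the commit message does not specify them.

```python
from typing import Tuple

LINE_BYTES = 64                      # assumed cache line size
BANKS_PER_LINE = 8                   # "8 banks in a cache line" from the message
BANK_BYTES = LINE_BYTES // BANKS_PER_LINE


def locate(addr: int, n_div: int) -> Tuple[int, int]:
    """Map a byte address to (division index, bank index)."""
    bank = (addr // BANK_BYTES) % BANKS_PER_LINE
    line = addr // LINE_BYTES
    div = line % n_div               # assumption: low line-index bits pick the division
    return div, bank


def rw_bank_conflict(read_addr: int, write_addr: int, n_div: int) -> bool:
    """A read and a write conflict only if they hit the same bank of the same division."""
    return locate(read_addr, n_div) == locate(write_addr, n_div)
```

With `n_div = 1` the check reduces to a plain bank comparison, so `rw_bank_conflict(0x00, 0x40, 1)` is true. With `n_div = 2` the same two addresses fall into different divisions and no longer conflict.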

Jun 06, 2023 · 1 commit

Disable chiselDB by default to minimize the size of DB (#2118) · 62129679

Committed by wakafa on Jun 06, 2023

* config: disable chiseldb by default to minimize db size

* note that tllog is still enabled when alwaysBasicDB is set

* bump huancun & utility

Jun 02, 2023 · 2 commits

top-down: align top-down with Gem5 (#2085) · d2b20d1a

Committed by Tang Haojin on Jun 02, 2023

* topdown: add defines of topdown counters enum

* redirect: add redirect type for perf

* top-down: add stallReason IOs

frontend -> ctrlBlock -> decode -> rename -> dispatch

* top-down: add dummy connections

* top-down: update TopdownCounters

* top-down: imp backend analysis and counter dump

* top-down: add HartId in `addSource`

* top-down: broadcast lqIdx of ROB head

* top-down: frontend signal done

* top-down: add memblock topdown interface

* Bump HuanCun: add TopDownMonitor

* top-down: receive and handle reasons in dispatch

* top-down: remove previous top-down code

* TopDown: add MemReqSource enum

* TopDown: extend mshr_latency range

* TopDown: add basic Req Source

TODO: distinguish prefetch

* dcache: distinguish L1DataPrefetch and CPUData

* top-down: comment out debugging perf counters in ibuffer

* TopDown: add path to pass MemReqSource to HuanCun

* TopDown: use simpler logic to count reqSource and update Probe count

* frontend: update topdown counters

* Update HuanCun Topdown for MemReqSource

* top-down: fix load stalls

* top-down: Change the priority of different stall reasons

* top-down: breakdown OtherCoreStall

* sbuffer: fix eviction

* when valid count reaches StoreBufferSize, do eviction

* sbuffer: fix replaceIdx

* If the way selected by the replacement algorithm cannot be written into dcache, its result is not used.

* dcache, ldu: fix vaddr in missqueue

This commit prevents the high bits of the virtual address from being truncated

* fix-ldst_pri-230506

* mainpipe: fix loadsAreComing

* top-down: disable dedup

* top-down: remove old top-down config

* top-down: split lq addr from ls_debug

* top-down: purge previous top-down code

* top-down: add debug_vaddr in LoadQueueReplay

* add source rob_head_other_repay

* remove load_l1_cache_stall_with/without_bank_conflict

* dcache: split CPUData & refill latency

* split CPUData to CPUStoreData & CPULoadData & CPUAtomicData
* monitor refill latency for all types of req

* dcache: fix perfcounter in mq

* io.req.bits.cancel should be applied when counting req.fire

* TopDown: add TopDown for CPL2 in XiangShan

* top-down: add hartid params to L2Cache

* top-down: fix dispatch queue bound

* top-down: no DqStall when robFull

* topdown: buspmu support latency statistic (#2106)

* perf: add buspmu between L2 and L3, support name argument

* bump difftest

* perf: busmonitor supports latency stat

* config: fix cpl2 compatible problem

* bump utility

* bump coupledL2

* bump huancun

* misc: adapt to utility key&field

* config: fix key&field source, remove deprecated argument

* buspmu: remove debug print

* bump coupledl2&huancun

* top-down: fix sq full condition

* top-down: classify "lq full" load bound

* top-down: bump submodules

* bump coupledL2: fix reqSource in data path

* bump coupledL2

---------
Co-authored-by: tastynoob <934348725@qq.com>
Co-authored-by: Guokai Chen <chenguokai17@mails.ucas.ac.cn>
Co-authored-by: lixin <1037997956@qq.com>
Co-authored-by: XiChen <chenxi171@mails.ucas.ac.cn>
Co-authored-by: Zhou Yaoyang <shinezyy@qq.com>
Co-authored-by: Lyn <lyn@Lyns-MacBook-Pro.local>
Co-authored-by: wakafa <wangkaifan@ict.ac.cn>

hint: add CustomHint interface (#2111) · b9e121df

Committed by happy-lx on Jun 02, 2023

* hint: add CustomHint interface

* dcache: fix replacement & mshrId update

* access replacement only once per load
* update mshrId in replayqueue only when this load enters mshr

* replay: block cache miss load

* block cache miss load until hint or dcache refill appears

* buffer: fix hint buffer depth to 1

* ldu: add dcache miss l2hint fast replay path

* bump coupledL2

* bump utility

---------
Co-authored-by: Lyn <lyn@Lyns-MacBook-Pro.local>
Co-authored-by: wangkaifan <wangkaifan@ict.ac.cn>

May 31, 2023 · 1 commit
- bump coupledL2 (#2108) · 2c1a69a0
  Committed by wakafa on May 31, 2023
May 30, 2023 · 3 commits

ldu: add load fast replay path (#2105) · 594c5198
Committed by sfencevma on May 30, 2023
```
Co-authored-by: Lyn <lyn@Lyns-MacBook-Pro.local>
```

util: fix constant assert and error (#2098) · 36414dd2
Committed by Maxpicca on May 30, 2023

LQ: fix select oldest inst & remove bank conf. block to avoid deadlock (#2100) · f2e8d419

Committed by sfencevma on May 30, 2023

* LoadQueueReplay: fix the worst case, in which all of the oldest instructions are allocated
to the same bank and there are more of them than there are stages in the load unit.
* Remove bank conflict block
* Increase priority for data replay

The deadlock scenario is as follows:

A LoadQueueReplay entry is not released immediately after its instruction is replayed
from LoadQueueReplay. For example, after instruction a is replayed from
LoadQueueReplay, entry 1 is still valid. If instruction a needs to be replayed again,
entry 1 is updated; otherwise entry 1 can be released.

If replay candidates are selected only by the time of their first enqueue (the age matrix),
some instructions may never be selected when too many instructions in LoadQueueReplay
are waiting to be replayed.

LoadQueueReplay therefore uses ldWbPtr, the pointer to the oldest instruction: when an
instruction's saved lqIdx equals ldWbPtr and the instruction can be replayed, it takes
priority over the selection result of the age matrix. To favor older instructions,
LoadQueueReplay computes the pointers ldWbPtr, ldWbPtr+1, ldWbPtr+2, ldWbPtr+3, ...,
and an instruction whose lqIdx matches one of them is selected first.

The pointer comparison produces an n-bit mask, which LoadQueueReplay scans from bit 0
to bit n-1; when the i-th bit is set, the i-th instruction is selected.

Now suppose the stride of the pointer comparison is larger than the number of pipeline
stages in the load unit, and the selected instruction still needs to be replayed after its
first replay (for example, because the data is not ready). Worse, in the mask generated
by the pointer comparison, the instructions after the oldest one (lqIdx equal to
ldWbPtr+1, ldWbPtr+2, ...) occupy the lower bits while the oldest instruction (lqIdx
equal to ldWbPtr) occupies a higher bit. The oldest instruction then cannot be selected.

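
The intended selection order (pointer match first, age matrix as a fallback) can be sketched as follows. This is a simplified model, not the hardware logic: `Entry`, `select_replay`, and the flat `lq_size` wraparound are assumptions for illustration, and a minimum over enqueue time stands in for the age matrix.

```python
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    lq_idx: int     # saved load queue index of the instruction
    enq_time: int   # time of the first enqueue (stand-in for the age matrix)
    ready: bool     # whether the instruction can be replayed now


def select_replay(entries: List[Optional[Entry]], ld_wb_ptr: int,
                  lq_size: int, stride: int) -> Optional[int]:
    """Return the index of the entry to replay, or None if nothing is ready."""
    # Pointer-match pass: offset 0 is ldWbPtr itself (the oldest load),
    # so scanning offsets in increasing order always favors the oldest.
    for offset in range(stride):
        want = (ld_wb_ptr + offset) % lq_size
        for i, e in enumerate(entries):
            if e is not None and e.ready and e.lq_idx == want:
                return i
    # Fallback: oldest ready entry by first-enqueue time.
    ready = [(e.enq_time, i) for i, e in enumerate(entries)
             if e is not None and e.ready]
    return min(ready)[1] if ready else None
```

The key point is that the mask is ordered by pointer offset rather than by entry position, so the instruction at ldWbPtr can never be shadowed by a younger one that happens to sit in a lower slot.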

May 28, 2023 · 1 commit

lsu, mdp: using sq based SSID comparison instead of LFST (#2081) · 159372dd

Committed by sfencevma on May 28, 2023

This commit provides MDP adaptation for #2077

* fix mdp: disable LFST, using SSID comparison instead of LFST

* add loadWaitStrict when comparing SSID

* fix store data wakeup logic
Co-authored-by: Lyn <lyn@Lyns-MacBook-Pro.local>

May 26, 2023 · 1 commit
- bump difftest (#2102) · 24f22b94
  Committed by wakafa on May 26, 2023
May 25, 2023 · 2 commits

Merge coupledL2 into master (#2064) · 15ee59e4

Committed by wakafa on May 25, 2023

* icache: Acquire -> Get to L2

* gitmodules: add coupledL2 as submodule

* cpl2: merge coupledL2 into master

* Changes include:
*   coupledL2 integration
*   modify user&echo fields in i$/d$/ptw
*   set d$ never always-releasedata
*   remove hw perfcnt connection for L2

* bump utility

* icache: remove unused releaseUnit

* config: minimalconfig includes l2

* Otherwise, dirty bits maintenance may be broken
* Known issue: L2 should have more than 1 bank to avoid compilation problems

* bump Utility

* bump coupledL2: fix bugs in dual-core

* bump coupledL2

* icache: set icache as non-coherent node

* bump coupledL2: fix dirty problem in L2 ProbeAckData

---------
Co-authored-by: guohongyu <20373696@buaa.edu.cn>
Co-authored-by: XiChen <chenxi171@mails.ucas.ac.cn>

script: enable chiseldb by default on running emu by xiangshan.py (#2091) · e3cd2c1f
Committed by wakafa on May 25, 2023
```
* script: enable chiseldb by default on running emu by xiangshan.py

* script: move db file to wave_home if emu failed
```

May 24, 2023 · 2 commits
- Update XSTile.scala (#2088) · a1c09046
  Committed by sfencevma on May 24, 2023
- Merge pull request #2086 from OpenXiangShan/kmh-bpu-history-checker · 1a7703ac
  Committed by Steve Gou on May 24, 2023
```
BPU: online history checker
```
May 23, 2023 · 7 commits
- bpu: history checker switch and code style · ab0200c8
  Committed by Easton Man on May 21, 2023
- bpu: use warn instead of error when checker disagrees · 65c5c719
  Committed by Easton Man on May 21, 2023
- bpu: add br_committed to update data path · cc2d1573
  Committed by Easton Man on May 21, 2023
- bpu: fix checker history maintenance in various conditions · 200d06cc
  Committed by Easton Man on May 21, 2023
- bpu: fix history shift source · 94a3f0aa
  Committed by Easton Man on May 18, 2023
- bpu: impl a history checker · 09d0c404
  Committed by Easton Man on May 18, 2023
- lsu, uncache buffer: fix uncache buffer writeback loadOut is incorrectly held (#2087) · cea46230
  Committed by sfencevma on May 23, 2023
```
* fix uncache buffer writeback fsm

* fix uncache buffer writeback fsm

* fix uncache buffer writeback control

---------
Co-authored-by: Lyn <lyn@Lyns-MacBook-Pro.local>
```
May 21, 2023 · 1 commit

lsu: split lq for larger ooo load window (#2077) · e4f69d78

Committed by sfencevma on May 21, 2023

BREAKING CHANGE: new LSU/LQ architecture introduced in this PR

In this commit, we replace the unified LQ with:
* virtual load queue
* load replay queue
* load rar queue
* load raw queue
* uncache buffer

This provides a larger ooo load window.

NOTE: the IPC loss in this commit is caused by MDP problems, because the previous MDP
does not fit the new LSU architecture.
The MDP update is not included in this commit; the IPC loss will be fixed by a later MDP update.

---------
Co-authored-by: Lyn <lyn@Lyns-MacBook-Pro.local>

May 16, 2023 · 1 commit

dcache: replace prefer invalid ways, disable replace update on 2nd miss replay (#2055) · 282f71c4

Committed by happy-lx on May 16, 2023

* When a replacement happens in loadpipe or mainpipe and there are invalid ways, use the invalid ways first instead of the way calculated by the replacer.
* Update replacement on a 2nd miss only when the request is issued for the first time.

* dcache: prefer using invalid way when replace

When a replacement happens in loadpipe or mainpipe and there are invalid
ways, use these ways first instead of the way calculated by the replacer

* dcache: fix replacement

If a request is merged by dcache, update replacement only when this
request is issued for the first time

* loadpipe: fix compile

* ldu: fix s1_repl_way_en
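
The "prefer invalid ways" rule can be illustrated with a short sketch. The function name `choose_replace_way` is hypothetical, and picking the lowest-numbered invalid way is an assumption; the commit message only says that invalid ways are used before the replacer's choice.

```python
from typing import Sequence


def choose_replace_way(valid: Sequence[bool], replacer_way: int) -> int:
    """Pick the victim way for a set.

    valid[i] tells whether way i currently holds a valid line;
    replacer_way is the way proposed by the replacement algorithm.
    """
    for way, is_valid in enumerate(valid):
        if not is_valid:
            # Assumption: take the lowest-numbered invalid way.
            return way
    # All ways are valid, so fall back to the replacer's choice.
    return replacer_way
```

For example, `choose_replace_way([True, False, True, False], replacer_way=2)` picks way 1 and leaves the valid line in way 2 untouched.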

May 15, 2023 · 2 commits
- Merge pull request #2062 from OpenXiangShan/tage-cond-fix · 040573ab
  Committed by Steve Gou on May 15, 2023
```
ITTAGE: fix missing base cond
```
- Merge pull request #2060 from Guo-HY/fdip-icache-migrate · 0277fa67
  Committed by Steve Gou on May 15, 2023
```
ICache FDIP migrate
```
May 10, 2023 · 3 commits
- dcache: parameterized sram org according to whether to use wpu (#2059) · 7dbf3a33
  Committed by Maxpicca on May 10, 2023
```
* add a switch for the WPU in dataArray

* dcache: fix cacheop dup logic

* dcache: fix wpu parameter
```
- lsu: fix no-translate bug of L1D prefetch datapath (#2074) · 57fe673e
  Committed by Ma-YX on May 10, 2023
- ITTAGE: fix missing base cond · 3cc8e5ca
  Committed by Guokai Chen on Apr 27, 2023
May 09, 2023 · 1 commit

Fix constant (#2071) · 047e34f9

Committed by Maxpicca on May 09, 2023

* constant: fix dead loop

* util: fix constant dynamic switch

* util: fix constant

OpenXiangShan / XiangShan · synced successfully 9 months ago