GTX Titan X Pascal

Seller Note “No signs of life”

Summary

  • Lovely looking card, really like it!
  • Resistances to ground look OK:
  • Vcore – 0.3Ω
  • Vmem – 65Ω
  • PEX – 48.6Ω
  • 1.8v – 800Ω
  • 5v – Ω
  • Voltages all present
  • Power-on testing shows the seller's description is not accurate: the fan and LED work, and the monitor backlight comes on.
  • MATS test reveals two failing chips, D0 and E1. Hmm, both are in the same corner; I hope it isn’t a memory controller issue.

mats version 400.184.  Testing GP102 with 20 MB of memory starting with 0 MB.

Read    Error Count: 0
Write   Error Count: 2630944
Unknown Error Count: 0

=== MEMORY ERRORS BY SUBPARTITION ===
SUBPART READ ERRORS WRITE ERRORS UNKNOWN ERRS
------- ----------- ------------ ------------
FBIOA0            0            0            0
FBIOA1            0            0            0
FBIOB0            0            0            0
FBIOB1            0            0            0
FBIOC0            0            0            0
FBIOC1            0            0            0
FBIOD0            0      1315264            0
FBIOD1            0            0            0
FBIOE0            0            0            0
FBIOE1            0      1315680            0
FBIOF0            0            0            0
FBIOF1            0            0            0

Failing Bits: 
   D000 D001 D002 D003 D004 D005 D006 D007 D008 D009 D010 D011 D012 D013 D014 D015 
D016 D017 D018 D019 D020 D021 D022 D023 D024 D025 D026 D027 D028 D029 D030 D031 
E032 E033 E034 E035 E036 E037 E038 E039 E040 E041 E042 E043 E044 E045 E046 E047 
E048 E049 E050 E051 E052 E053 E054 E055 E056 E057 E058 E059 E060 E061 E062 E063 



=== MEMORY ERRORS BY BIT ===
P : Partition (FBIO)
                                            READ 0 READ 1 READ ?
P BIT READ ERRORS WRITE ERRORS UNKNOWN ERRS EXP. 1 EXP. 0 EXP. ?
- --- ----------- ------------ ------------ ------ ------ ------
D 000           0       441851            0    111 441740      0
D 001           0       873356            0     70 873286      0
D 002           0       441828            0     68 441760      0
D 003           0       873393            0     44 873349      0
D 004           0       441876            0    156 441720      0
D 005           0       873378            0     71 873307      0
D 006           0       441843            0     91 441752      0
D 007           0       873384            0     52 873332      0
D 008           0       657652            0 436774 220878      0
D 009           0       657614            0 220914 436700      0
D 010           0       711589            0 545911 165678      0
D 011           0       657615            0 220911 436704      0
D 012           0       657639            0 436751 220888      0
D 013           0       792511            0  82868 709643      0
D 014           0       657616            0 436716 220900      0
D 015           0       576695            0 303742 272953      0
D 016           0       631223            0 383278 247945      0
D 017           0       740138            0 136578 603560      0
D 018           0       575224            0 269962 305262      0
D 019           0       740141            0 136580 603561      0
D 020           0       631223            0 383278 247945      0
D 021           0       441806            0 441772     34      0
D 022           0       873244            0 873225     19      0
D 023           0       684115            0 193884 490231      0
D 024           0       683848            0 489737 194111      0
D 025           0       873458            0     20 873438      0
D 026           0       630811            0 382445 248366      0
D 027           0       684341            0 193368 490973      0
D 028           0       873454            0 873436     18      0
D 029           0       441806            0 441772     34      0
D 030           0       873454            0 873436     18      0
D 031           0       441806            0 441772     34      0
E 032           0       855796            0 835192  20604      0
E 033           0       775152            0 409492 365660      0
E 034           0       869881            0 832995  36886      0
E 035           0       786472            0 409472 377000      0
E 036           0       873472            0 873472      0      0
E 037           0       873472            0      0 873472      0
E 038           0       873472            0 873472      0      0
E 039           0       442208            0 442208      0      0
E 040           0       442208            0      0 442208      0
E 041           0       873472            0      0 873472      0
E 042           0       442208            0      0 442208      0
E 043           0       197467            0  35065 162402      0
E 044           0       442208            0      0 442208      0
E 045           0       261886            0  83600 178286      0
E 046           0       442208            0      0 442208      0
E 047           0       873472            0      0 873472      0
E 048           0       442208            0      0 442208      0
E 049           0       530435            0 427947 102488      0
E 050           0       529714            0 501393  28321      0
E 051           0       499395            0 429493  69902      0
E 052           0       526267            0 501708  24559      0
E 053           0       579458            0 433783 145675      0
E 054           0       442208            0      0 442208      0
E 055           0       873472            0      0 873472      0
E 056           0       665009            0 621029  43980      0
E 057           0       825773            0 124362 701411      0
E 058           0       666817            0 618999  47818      0
E 059           0       646190            0 311535 334655      0
E 060           0       669404            0 621290  48114      0
E 061           0       489907            0 317846 172061      0
E 062           0       674708            0 621327  53381      0
E 063           0       825773            0 124362 701411      0
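The per-bit table above can be summarised by asking, for each bit, which way its errors lean. The following is a minimal sketch, not part of MATS: the handful of rows are copied from the table, while the 99% cut-off and the `classify` helper are assumptions made purely for illustration. It labels the direction of the errors and does not diagnose anything by itself.

```python
# (partition, bit, "READ 0 / EXP. 1" count, "READ 1 / EXP. 0" count),
# copied from a few rows of the MATS per-bit table.
SAMPLE_ROWS = [
    ("D", 0, 111, 441740),
    ("D", 15, 303742, 272953),
    ("D", 21, 441772, 34),
    ("E", 36, 873472, 0),
    ("E", 37, 0, 873472),
    ("E", 59, 311535, 334655),
]

# Hypothetical cut-off: call a bit one-sided if at least 99% of its
# errors go in the same direction.
ONE_SIDED = 0.99


def classify(read0_exp1, read1_exp0, threshold=ONE_SIDED):
    """Label which way a bit's errors lean."""
    total = read0_exp1 + read1_exp0
    if total == 0:
        return "no errors"
    low_share = read0_exp1 / total
    if low_share >= threshold:
        return "reads 0, expected 1"
    if low_share <= 1 - threshold:
        return "reads 1, expected 0"
    return "mixed"


for partition, bit, low, high in SAMPLE_ROWS:
    print(f"{partition}{bit:03d}: {classify(low, high)}")
```

On these sample rows, E036 and E037 fail in one direction only, while D015 and E059 fail both ways in roughly equal measure.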


=== MEMORY ERRORS BY ADDRESS ===
ADDRESS : Failing memory address, or buffer offset if starting with 'X+'
T : Type of memory error: W = write, R = read
P : Partition (FBIO)
S : Subpartition
B : Bank
E : Beat
U : PseudoChannel
   ADDRESS EXPECTED   ACTUAL  REREAD1  REREAD2 FAILBITS TPSBEU  ROW COL                                                                     BIT(s)
   ------- --------   ------  -------  ------- -------- ------  --- ---                                                                     ------
000135fcbc 00000000 75ff48ff 75ff48ff 75ff48ff 75ff48ff WD0860 0019 065      D000,D001,D002,D003,D004,D005,D006,D007,D008,D010,D011,D012,D013,D014
000135fcb8 00000000 d2ffe9ff d2ffe9ff d2ffe9ff d2ffe9ff WD0840 0019 065 D000,D001,D002,D003,D004,D005,D006,D007,D008,D009,D011,D012,D013,D014,D015
000135fcb4 00000000 beff2aff beff2aff beff2aff beff2aff WD0820 0019 065      D000,D001,D002,D003,D004,D005,D006,D007,D009,D010,D011,D012,D013,D015
000135fcb0 00000000 25ff23ff 25ff23ff 25ff23ff 25ff23ff WD0800 0019 065                D000,D001,D002,D003,D004,D005,D006,D007,D008,D009,D010,D013
000135fcac 00000000 b2ff1eff b2ff1eff b2ff1eff b2ff1eff WD0860 0019 064      D000,D001,D002,D003,D004,D005,D006,D007,D009,D010,D011,D012,D013,D015
000135fca8 00000000 7dff64ff 7dff64ff 7dff64ff 7dff64ff WD0840 0019 064      D000,D001,D002,D003,D004,D005,D006,D007,D008,D010,D011,D012,D013,D014
000135fca4 00000000 21ff6bff 21ff6bff 21ff6bff 21ff6bff WD0820 0019 064           D000,D001,D002,D003,D004,D005,D006,D007,D008,D009,D011,D013,D014
000135fca0 00000000 bbff70ff bbff70ff bbff70ff bbff70ff WD0800 0019 064 

......


000041170c 00000000 b2ff1eff b2ff1eff b2ff1eff b2ff1eff WD0e60 0005 018      D000,D001,D002,D003,D004,D005,D006,D007,D009,D010,D011,D012,D013,D015
0000411708 00000000 7dff64ff 7dff64ff 7dff64ff 7dff64ff WD0e40 0005 018      D000,D001,D002,D003,D004,D005,D006,D007,D008,D010,D011,D012,D013,D014
0000411704 00000000 21ff6bff 21ff6bff 21ff6bff 21ff6bff WD0e20 0005 018           D000,D001,D002,D003,D004,D005,D006,D007,D008,D009,D011,D013,D014
0000411700 00000000 bbff70ff bbff70ff bbff70ff bbff70ff WD0e00 0005 018 D000,D001,D002,D003,D004,D005,D006,D007,D008,D009,D011,D012,D013,D014,D015
000079c57c 00000000 0f0e0f9f 0f0e0f9f 0f0e0f9f 0f0e0f9f WD0d61 000a 023                          D016,D017,D018,D019,D020,D023,D024,D025,D026,D027
000079c578 00000000 0f0e0f0e 0f0e0f0e 0f0e0f0e 0f0e0f0e WD0d41 000a 023                                         D017,D018,D019,D024,D025,D026,D027
000079c574 00000000 0f0e0f91 0f0e0f91 0f0e0f91 0f0e0f91 WD0d21 000a 023                          D016,D017,D018,D019,D020,D023,D024,D025,D026,D027
000079c570 00000000 0f0e0291 0f0e0291 0f0e0291 0f0e0291 WD0d01 000a 023                          D016,D017,D018,D019,D020,D023,D024,D025,D026,D027
000079c56c 00000000 020e029f 020e029f 020e029f 020e029f WD0d61 000a 022                                         D016,D017,D018,D019,D020,D023,D025
000079c568 00000000 020e0e91 020e0e91 020e0e91 020e0e91 WD0d41 000a 022                               D016,D017,D018,D019,D020,D023,D025,D026,D027
000079c564 00000000 0291029f 0291029f 0291029f 0291029f WD0d21 000a 022                                         D016,D017,D018,D019,D020,D023,D025
000079c560 00000000 0e91029f 0e91029f 0e91029f 0e91029f WD0d01 000a 022                               D016,D017,D018,D019,D020,D023,D025,D026,D027
000079c55c 00000000 75ff48ff 75ff48ff 75ff48ff 75ff48ff WD0d60 000a 023      D000,D001,D002,D003,D004,D005,D006,D007,D008,D010,D011,D012,D013,D014
000079c558 00000000 d2ffe9ff d2ffe9ff d2ffe9ff d2ffe9ff WD0d40 000a 023 D000,D001,D002,D003,D004,D005,D006,D007,D008,D009,D011,D012,D013,D014,D015
000079c554 00000000 beff2aff beff2aff beff2aff beff2aff WD0d20 000a 023      D000,D001,D002,D003,D004,D005,D006,D007,D009,D010,D011,D012,D013,D015
000079c550 00000000 25ff23ff 25ff23ff 25ff23ff 25ff23ff WD0d00 000a 023                D000,D001,D002,D003,D004,D005,D006,D007,D008,D009,D010,D013
000079c54c 00000000 b2ff1eff b2ff1eff b2ff1eff b2ff1eff WD0d60 000a 022      D000,D001,D002,D003,D004,D005,D006,D007,D009,D010,D011,D012,D013,D015
000079c548 00000000 7dff64ff 7dff64ff 7dff64ff 7dff64ff WD0d40 000a 022      D000,D001,D002,D003,D004,D005,D006,D007,D008,D010,D011,D012,D013,D014
000079c544 00000000 21ff6bff 21ff6bff 21ff6bff 21ff6bff WD0d20 000a 022           D000,D001,D002,D003,D004,D005,D006,D007,D008,D009,D011,D013,D014
000079c540 00000000 bbff70ff bbff70ff bbff70ff bbff70ff WD0d00 000a 022 D000,D001,D002,D003,D004,D005,D006,D007,D008,D009,D011,D012,D013,D014,D015
000079857c 00000000 75ff48ff 75ff48ff 75ff48ff 75ff48ff WD0d60 000a 013      D000,D001,D002,D003,D004,D005,D006,D007,D008,D010,D011,D012,D013,D014
0000798578 00000000 d2ffe9ff d2ffe9ff d2ffe9ff d2ffe9ff WD0d40 000a 013 D000,D001,D002,D003,D004,D005,D006,D007,D008,D009,D011,D012,D013,D014,D015
0000798574 00000000 beff2aff beff2aff beff2aff beff2aff WD0d20 000a 013      D000,D001,D002,D003,D004,D005,D006,D007,D009,D010,D011,D012,D013,D015
0000798570 00000000 25ff23ff 25ff23ff 25ff23ff 25ff23ff WD0d00 000a 013                D000,D001,D002,D003,D004,D005,D006,D007,D008,D009,D010,D013
If you are getting failure for first MB of FB then try option -no_scan_out
Error Code = 00000001 

                                        
 #######     ####    ########  ###      
 #######    ######   ########  ###      
 ##        ##    ##     ##     ###      
 ##        ##    ##     ##     ###      
 #######   ########     ##     ###      
 #######   ########     ##     ###      
 ##        ##    ##     ##     ###      
 ##        ##    ##  ########  ######## 
 ##        ##    ##  ########  ######## 
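To make the address rows in the report easier to sift through, each one can be split into named fields. This is a rough sketch under stated assumptions: `parse_row` and the field names are my own, and the code assumes that the six characters of the TPSBEU column map one-to-one, in order, onto the legend (type, partition, subpartition, bank, beat, pseudochannel). ROW and COL are kept as text because the report does not say which number base they use.

```python
# Assumed meaning of each character in the TPSBEU column, in legend order.
FIELD_NAMES = ("type", "partition", "subpartition", "bank", "beat", "pseudochannel")


def parse_row(line):
    """Split one 'MEMORY ERRORS BY ADDRESS' row into a dict."""
    parts = line.split()
    address, expected, actual, reread1, reread2, failbits = (
        int(p, 16) for p in parts[:6]
    )
    return {
        "address": address,
        "expected": expected,
        "actual": actual,
        "rereads": (reread1, reread2),
        "failbits": failbits,
        "fields": dict(zip(FIELD_NAMES, parts[6])),
        "row": parts[7],
        "col": parts[8],
        # Some rows in the pasted log have no bit list at all.
        "bits": parts[9].split(",") if len(parts) > 9 else [],
    }


SAMPLE = (
    "000135fcbc 00000000 75ff48ff 75ff48ff 75ff48ff 75ff48ff WD0860 0019 065 "
    "D000,D001,D002,D003,D004,D005,D006,D007,D008,D010,D011,D012,D013,D014"
)

row = parse_row(SAMPLE)
# FAILBITS should simply be the XOR of what was expected and what came back.
assert row["failbits"] == row["expected"] ^ row["actual"]
print(row["fields"])
# True when both rereads return the same wrong value as the first read.
print(row["actual"] == row["rereads"][0] == row["rereads"][1])
```

In the rows shown, both rereads match the first wrong value, which fits MATS counting these as write errors rather than read errors.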
                                        

The memory chip is marked D9TXS, which is the package code for the Micron MT58K256M32JA-100:A.

I have a feeling this isn’t going to be a routine memory replacement: all bits are failing in both chips, which in the worst case points to some kind of power or memory controller (GPU) issue.
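As a quick sanity check of the "all bits in both chips" reading, the subpartition counts from the MATS report can be cross-checked against the reported total and the failing-bit list. This is only a sketch: the counts are copied from the report, while the `bit_range` helper and its mapping (subpartition 0 covers bits 0-31, subpartition 1 covers bits 32-63 of a partition) are an assumption inferred from the failing bits D000-D031 and E032-E063.

```python
# Write-error counts per subpartition, copied from the MATS report.
WRITE_ERRORS = {
    "FBIOA0": 0, "FBIOA1": 0,
    "FBIOB0": 0, "FBIOB1": 0,
    "FBIOC0": 0, "FBIOC1": 0,
    "FBIOD0": 1315264, "FBIOD1": 0,
    "FBIOE0": 0, "FBIOE1": 1315680,
    "FBIOF0": 0, "FBIOF1": 0,
}
REPORTED_TOTAL = 2630944  # "Write Error Count" from the report header


def failing_subpartitions(errors):
    """Return only the subpartitions that logged at least one error."""
    return {name: count for name, count in errors.items() if count > 0}


def bit_range(name):
    """Assumed mapping: subpartition 0 -> bits 0-31, subpartition 1 -> bits 32-63."""
    sub = int(name[-1])
    return range(32 * sub, 32 * sub + 32)


# The per-subpartition counts should add up to the reported total.
assert sum(WRITE_ERRORS.values()) == REPORTED_TOTAL

failing = failing_subpartitions(WRITE_ERRORS)
for name, count in failing.items():
    bits = bit_range(name)
    partition = name[4]  # the letter after "FBIO"
    print(f"{name}: {count} write errors, "
          f"bits {partition}{bits[0]:03d}-{partition}{bits[-1]:03d}")
```

Only FBIOD0 and FBIOE1 show up, and the assumed bit ranges line up with the "Failing Bits" list in the report.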

Attempted replacement of E1:

  • It wasn’t easy to remove the chip; it took several attempts and eventually 450 °C hot air with the pre-heater.
  • Fitting the replacement also seemed hard, and I didn’t get it to the point of a ‘nudge test’ for fear of overheating. This was likely a mistake.
  • Tried a speculative test anyway, as no short is present and the chip ‘appears’ to be seated properly.
  • As expected, the test was unsuccessful: MATS gave exactly the same report.
  • I think I need to regroup on this one and take some other measurements to work out whether there is another issue. Hopefully some lessons will be learned even if the card never works.

Update 03/09/2022

  • Two chips next to each other on different channels (D0, E1); all bits in both chips have errors.
  • The concern is that the memory controller is at fault, since the two failing chips sit together.
  • Possibly there are broken connections under one corner of the core.
  • I could try checking the voltages, though I would be somewhat lucky if the fault turned out to be something like that.
  • The corner of the core closest to the chips does look a little raised: a possible reflow/reball opportunity? It would be good to confirm this is actually the issue first.